Examiner: Not Yet Assigned 
Art Unit: Not Yet Assigned 
PRELIMINARY AMENDMENT 



Assistant Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

Prior to examination of the above-referenced application and calculation of 
the filing fee, please enter the following amendments and remarks. 

IN THE TITLE: 

Please delete the current title and substitute therefor: 

-- ARRAYS AND PHOTOLITHOGRAPHIC MEANS FOR THEIR MANUFACTURE -- 



IN THE CLAIMS: 

Please cancel claims 1-171 without prejudice. 

Please add the following new claims 172-212: 

172. (New) A substrate with a surface comprising a plurality of 
polypeptides with different, known sequences bound to the surface in discrete known 
regions, at a density exceeding 400 different polypeptides occupying a total area of less than 
1 cm² on said substrate, said groups of polypeptides having different polypeptide sequences. 

173. (New) The substrate as recited in claim 172, wherein said substrate 
comprises 10³ or more different groups of polypeptides with known sequences bound to 
discrete known regions of said substrate. 

174. (New) The substrate as recited in claim 172, wherein said substrate 
comprises 10⁴ or more different groups of polypeptides with known sequences bound to 
discrete known regions of said substrate. 

175. (New) The substrate as recited in claim 172, wherein said substrate 
comprises 10⁵ or more different groups of polypeptides with known sequences in discrete 
known regions. 

176. (New) The substrate as recited in claim 172, wherein said substrate 
comprises 10⁶ or more different groups of polypeptides with known sequences in discrete 
known regions. 

177. (New) The substrate as recited in claim 172, wherein said groups of 
polypeptides are at least 50% pure within said discrete known regions. 

178. (New) The substrate as recited in claim 172, wherein the groups of 
polypeptides are attached to the surface by a linker. 

179. (New) The substrate as recited in claim 172, wherein the groups of 
polypeptides are covalently attached to the surface. 

180. (New) A substrate with a surface comprising a plurality of polypeptides 
with different, known sequences bound to the surface in discrete known regions, at a density 
exceeding 1000 different polypeptides occupying a total area of less than 1 cm² on said 
substrate, said groups of polypeptides having different polypeptide sequences. 

181. (New) The substrate as recited in claim 180, wherein said substrate 
comprises 10³ or more different groups of polypeptides with known sequences bound to 
discrete known regions of said substrate. 

182. (New) The substrate as recited in claim 180, wherein said substrate 
comprises 10⁴ or more different groups of polypeptides with known sequences bound to 
discrete known regions of said substrate. 

183. (New) The substrate as recited in claim 180, wherein said substrate 
comprises 10⁵ or more different groups of polypeptides with known sequences in discrete 
known regions. 

184. (New) The substrate as recited in claim 180, wherein said substrate 
comprises 10⁶ or more different groups of polypeptides with known sequences in discrete 
known regions. 

185. (New) An array of more than 1,000 different groups of polypeptide 
molecules with known sequences bound to a surface of a substrate, said groups of 
polypeptide molecules each in discrete known regions and differing from other groups of 
polypeptide molecules in monomer sequence, each of said discrete known regions being an 
area of less than about 0.01 cm² and each discrete known region comprising polypeptides of 
known sequence, said different groups occupying a total area of less than 1 cm². 

186. (New) The array as recited in claim 185, wherein said discrete known 
region is less than about 1x10⁻² cm² to about 1x10⁻⁵ cm². 

187. (New) The method as recited in claim 186, wherein said discrete 
known region is less than about 1x10⁻² cm² to about 1x10⁻⁴ cm². 

188. (New) The method as recited in claim 187, wherein said discrete 
known region is less than about 1x10⁻² cm² to about 1x10⁻³ cm². 

189. (New) The array as recited in claim 185, made by the process of: 

(a) providing a polypeptide array comprising at least two different 
polypeptides immobilized on a surface, and wherein said polypeptides are synthesized on 
said surface; 

(b) contacting said surface with a first protected amino acid wherein said 
first protected amino acid is selectively coupled to a functional group in a first selectively 
activated region of said surface; 

(c) contacting said surface with a second protected amino acid without 
physical segregation of said surface such that said second protected amino acid is selectively 
coupled to a functional group in a second selectively activated region of said surface; and, 

(d) repeating the above steps until at least two different polypeptides are 
formed at known locations on said substrate surface. 

190. (New) The array as recited in claim 189, wherein said first selectively 
activated region of said substrate is exposed to light to remove a photoremovable group from 
said first protected amino acid. 

191. (New) The array as recited in claim 185, comprising more than 10,000 
groups of polypeptides of known sequences. 

192. (New) An array of polypeptides, said array of polypeptides comprising: 
a substrate having a surface; and 

a plurality of different polypeptides bound to said surface at a density 
exceeding 400 different polypeptides/cm², wherein each of said plurality of different 
polypeptides is attached to said surface in a different known location of area greater than 100 
square microns and has a different determinable sequence. 

193. (New) The array of claim 192, wherein said substrate is a solid 
support. 

194. (New) The array of claim 193, wherein said solid support 
is a member selected from the group consisting of particles, strands, precipitates, gels, sheets, 
tubing, spheres, containers, capillaries, pads, slices, films, plates, and slides. 

195. (New) The array of claim 193, wherein said solid support is made of a 
member selected from the group consisting of polymers, plastics, resins, polysaccharides, 
silica or silica-based materials, carbon, metals, inorganic glasses, and membranes. 

196. (New) The array of claim 193, wherein said solid support is glass. 

197. (New) The array of claim 193, wherein said solid support is a gel. 

198. (New) The array of claim 193, wherein said polypeptides are attached 
to said solid support through a linker group. 

199. (New) The array of claim 193, wherein said array comprises at least 
1,000 different polypeptides attached to said solid support. 

200. (New) The array of claim 193, wherein said array comprises at least 
10,000 different polypeptides attached to said solid support. 

201. (New) The array of claim 193, wherein said plurality of different 
polypeptides attached to said surface are at a density exceeding 1000 different 
polypeptides/cm². 

202. (New) The array of claim 192, wherein each of said different known 
locations is physically separated from each of the other known locations. 

203. (New) The array of claim 192, wherein said polypeptides in said 
different known locations comprise polypeptides that are at least 20% pure. 

204. (New) The array of claim 192, wherein said polypeptides in said 
different known locations comprise polypeptides that are at least 50% pure. 

205. (New) The array of claim 192, wherein said polypeptides in said 
different known locations are at least 80% pure. 

206. (New) The array of claim 192, wherein said polypeptides in said different 
known locations are at least 90% pure. 

207. (New) The array of claim 192, wherein said array is produced by a 
process comprising: 

providing a planar, non-porous solid support, said solid support having a 
plurality of compounds immobilized on a surface thereof, said compounds having protecting 
groups coupled thereto; deprotecting a first portion of said plurality of compounds on said 
surface and not a second portion of said plurality of compounds; 

reacting said first portion of said plurality of compounds with a first reactant; 

deprotecting at least a third portion of said plurality of compounds on said 
surface, said third portion comprising a fraction of said first portion of said plurality of 
compounds; 

reacting said at least third portion of said plurality of compounds with a 
second reactant; and 

optionally repeating said synthesis steps to produce said polypeptide array. 

208. (New) The array of claim 192, wherein said polypeptides in said 
different known locations are at least 10% pure. 

209. (New) The array of claim 192, wherein said support is rigid. 

210. (New) An array of polypeptides, said array of polypeptides 
comprising: 

a planar rigid support having at least a first surface; and 
a plurality of different polypeptides bound to said first surface of said planar 
rigid support at a density exceeding 400 different polypeptides/cm², wherein each of said 
different polypeptides is attached to said surface of said solid support and has a different 
determinable sequence. 

211. (New) The array of claim 210, wherein said density exceeds 1000 
different polypeptides occupying a total area of less than 1 cm² on said substrate. 

212. (New) The array of claim 210, wherein said plurality of different 
polypeptides exceeds 1000 different groups wherein each of said plurality of different 
polypeptides is attached to said surface in a different known location of area greater than 100 
square microns and has a different determinable sequence. 

REMARKS 

Claims 172-212 are pending in this application and presented for examination. 
Claims 1-171 have been canceled without prejudice or disclaimer. Early examination on the 
merits is respectfully requested. 



THE APPLICATION 

The present application is a continuation of 09/557,875, filed April 24, 2000, which 
is a continuation of 09/056,927 filed April 8, 1998, which is a continuation of 08/670,118 filed 
June 25, 1996, now U.S. Patent No. 5,800,992, which is a divisional of 08/168,904 filed December 
15, 1993, which is a continuation of 07/624,114, filed December 6, 1990. The present application 
is also a continuation-in-part of USSN 07/362,901, filed June 7, 1989. This application is also a 
continuation-in-part of USSN 08/348,471 filed November 30, 1994, which is a continuation of 
USSN 07/805,727 filed December 6, 1991 (now U.S. Patent No. 5,424,186), which is a 
continuation-in-part of USSN 07/492,462, filed March 7, 1990 (now U.S. Patent No. 5,143,854). 

The specification of the present case differs from that of 07/624,114 ("the '114 
application") in that portions of certain patent filings cited and incorporated by reference in the 
'114 application have been reproduced in the present application. Specifically, the incorporated 
portions are from US Application Serial Nos.: 07/362,901, 07/492,462 and 07/624,120. These 
applications are all cited in the cross-reference to related applications in the '114 application. The 
incorporated portions provide further details of methods for array synthesis. Figures from 
incorporated text have been renumbered for conformity with existing text. The cross-reference to 
related applications in the present application has also been updated relative to that in the '114 
application to reflect issuance of certain patents from cited applications. The cross-reference has 
also been amended to include an additional priority claim to application 07/805,727, filed 
December 6, 1991 and predecessor cases. Further, the title of the present application has been 
amended relative to the '114 application to conform to the pending claims. No new matter is involved in any 
amendments. In brief, other than for the noted clerical changes and the update of the 
cross-reference, the specification of the present application is that of the '114 application and 
prior applications incorporated by reference in the same. Accordingly, it is submitted that the present 
application has the same effective filing date as the '114 application (i.e., December 6, 1990), and 
that at least some of the present claims have priority to the earlier '462 and/or '901 predecessor 
applications filed March 7, 1990 and June 7, 1989 respectively. 

CONCLUSION 

In view of the foregoing, Applicants respectfully request early examination on the 
merits. If the Examiner believes a telephone conference would expedite prosecution of this 
application, please telephone the undersigned at 415-576-0200. 

Respectfully submitted, 

CROSS-REFERENCE TO RELATED APPLICATION 

The present application is a continuation of 09/056,927 filed April 8, 1998, 
which is a continuation of 08/670,118 filed June 25, 1996, now US 5,800,992, which is a 
divisional of 08/168,904 filed December 15, 1993, which is a continuation of 07/624,114, 
filed December 6, 1990 (incorporated by reference). This application is a 
continuation-in-part application of commonly assigned patent applications Pirrung et al., U.S.S.N. 
07/362,901 (VLSIPS parent) filed on June 7, 1989; and Pirrung et al., U.S.S.N. 
07/492,462 (VLSIPS CIP), filed on March 7, 1990 (now US 5,143,854), which are hereby 
incorporated herein by reference. This application is also a continuation-in-part of USSN 
08/348,471 filed November 30, 1994, which is a continuation of USSN 07/805,727 filed 
December 6, 1991 (now US 5,424,186), which is a continuation-in-part of USSN 
07/492,462, filed March 7, 1990 (now US 5,143,854), which is a continuation-in-part of 
USSN 07/362,901, filed June 7, 1989. Additional commonly assigned applications Barrett 
et al., U.S.S.N. 07/435,316 (caged biotin parent) filed November 13, 1989; and Barrett et 
al., U.S.S.N. 07/612,671 (caged biotin CIP), filed November 13, 1990 are also 
incorporated herein by reference. Additional applications Pirrung et al., U.S.S.N. 
07/624,120 (now abandoned) a divisional of which has issued as US 5,744,101 and Dower 
et al., U.S.S.N. 07/626,730 (now US 5,547,839), which are also commonly assigned and 
filed on the same day as this application, are also hereby incorporated herein by reference. 



SEQUENCING BY HYBRIDIZATION OF A TARGET NUCLEIC ACID 
TO A MATRIX OF DEFINED OLIGONUCLEOTIDES 

BACKGROUND OF THE INVENTION 
The present invention relates to the sequencing, 
fingerprinting, and mapping of polymers, particularly 
biological polymers. The inventions may be applied, for 
example, in the sequencing, fingerprinting, or mapping of 
nucleic acids, polypeptides, oligosaccharides, and synthetic 
polymers. 

The relationship between structure and function of 
macromolecules is of fundamental importance in the 
understanding of biological systems. These relationships are 
important to understanding, for example, the functions of 
enzymes, structural proteins, and signalling proteins, ways in 
which cells communicate with each other, as well as mechanisms 
of cellular control and metabolic feedback. 

Genetic information is critical in continuation of 
life processes. Life is substantially informationally based 
and its genetic content controls the growth and reproduction of 
the organism and its complements. Polypeptides, which are 
critical features of all living systems, are encoded by the 
genetic material of the cell. In particular, the properties of 
enzymes, functional proteins, and structural proteins are 
determined by the sequence of amino acids which make them up. 
As structure and function are integrally related, many 
biological functions may be explained by elucidating the 
underlying structural features which provide those 
functions. For this reason, it has become very important to 
determine the genetic sequences of nucleotides which encode the 
enzymes, structural proteins, and other effectors of biological 
functions. In addition to segments of nucleotides which encode 
polypeptides, there are many nucleotide sequences which are 
involved in control and regulation of gene expression. 

The human genome project is directed toward 
determining the complete sequence of the genome of the human 
organism. Although such a sequence would not correspond to the 
sequence of any specific individual, it would provide 
significant information as to the general organization and 
specific sequences contained within segments from particular 
individuals. It would also provide mapping information which 
is very useful for further detailed studies. However, the need 
for highly rapid, accurate, and inexpensive sequencing 
technology is nowhere more apparent than in a demanding 
sequencing project such as this. To complete the sequencing of 
a human genome would require the determination of approximately 
3x10⁹, or 3 billion, base pairs. 

The procedures typically used today for sequencing 
include the Sanger dideoxy method, see, e.g., Sanger et al. 
(1977) Proc. Natl. Acad. Sci. USA 74:5463-5467, or the Maxam 
and Gilbert method, see, e.g., Maxam et al. (1980) Methods in 
Enzymology 65:499-559. The Sanger method utilizes enzymatic 
elongation procedures with chain terminating nucleotides. The 
Maxam and Gilbert method uses chemical reactions exhibiting 
specificity of reaction to generate nucleotide specific 
cleavages. Both methods require a practitioner to perform a 
large number of complex manual manipulations. These 
manipulations usually require isolating homogeneous DNA 
fragments, elaborate and tedious preparing of samples, 
preparing a separating gel, applying samples to the gel, 
electrophoresing the samples into this gel, working up the 
finished gel, and analyzing the results of the procedure. 

Thus, a less expensive, highly reliable, and labor 
efficient means for sequencing biological macromolecules is 
needed. A substantial reduction in cost and increase in speed 
of nucleotide sequencing would be very much welcomed. In 
particular, an automated system would improve the 
reproducibility and accuracy of procedures. The present 
invention satisfies these and other needs. 

SUMMARY OF THE INVENTION 
The present invention provides improved methods 
useful for de novo sequencing of an unknown polymer sequence, 
for verification of known sequences, for fingerprinting 
polymers, and for mapping homologous segments within a 
sequence. By reducing the number of manual manipulations 
required and automating most of the steps, the speed, accuracy, 
and reliability of these procedures are greatly enhanced. 

The production of a substrate having a matrix of 
positionally defined regions with attached reagents exhibiting 
known recognition specificity can be used for the sequence 
analysis of a polymer. Although most directly applicable to 
sequencing, the present invention is also applicable to 
fingerprinting, mapping, and general screening of specific 
interactions. The VLSIPS substrates will be applied to 
evaluating other polymers, e.g., carbohydrates, polypeptides, 
hydrocarbon synthetic polymers, and the like. For these 
non-polynucleotides, the sequence specific reagents will usually be 
antibodies specific for a particular subunit sequence. 

The present invention also provides a means to 
automate sequencing manipulations. The automation of the 
substrate production method and of the scan and analysis steps 
minimizes the need for human intervention. This simplifies the 
tasks and promotes reproducibility. 

The present invention provides a composition 
comprising a plurality of positionally distinguishable sequence 
specific reagents attached to a solid substrate, which reagents 
are capable of specifically binding to a predetermined subunit 
sequence of a preselected multi-subunit length having at least 
three subunits, said reagents representing substantially all 
possible sequences of said preselected length. In some 
embodiments, the subunit sequence is a polynucleotide or a 
polypeptide; in others the preselected multi-subunit length is 
five subunits and the subunit sequence is a polynucleotide 
sequence. In other embodiments, the specific reagent is an 
oligonucleotide of at least about five nucleotides. 
Alternatively, the specific reagent is a monoclonal antibody. 
Usually the specific reagents are all attached to a single 
solid substrate, and the reagents comprise about 3000 different 
sequences. In other embodiments, the reagents represent at 
least about 25% of the possible subsequences of said 
preselected length. Usually, the reagents are localized in 
regions of the substrate having a density of at least 25 
regions per square centimeter, and often the substrate has a 
surface area of less than about 4 square centimeters. 
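
By way of illustration only, and not as part of the original disclosure, the notion of reagents "representing substantially all possible sequences" of a preselected length can be sketched in Python for the five-nucleotide case. The function names `all_probes` and `grid_layout`, the square-grid addressing, and the choice of spreading the probes over 4 square centimeters are assumptions made for this example.

```python
from itertools import product
import math

BASES = "ACGT"

def all_probes(k):
    """Return every possible oligonucleotide of length k, in lexicographic order."""
    return ["".join(p) for p in product(BASES, repeat=k)]

def grid_layout(probes):
    """Give each probe a (row, column) address on the smallest square grid that holds them all."""
    side = math.isqrt(len(probes) - 1) + 1
    return {probe: divmod(i, side) for i, probe in enumerate(probes)}

probes = all_probes(5)            # 4**5 distinct pentamers
layout = grid_layout(probes)      # probe sequence -> known position
density = len(probes) / 4.0       # regions per cm² if the grid covers 4 cm² (assumed area)
```

Under these assumptions the complete pentamer set fits on a 32 by 32 grid, and the resulting density is well above the 25 regions per square centimeter mentioned above.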

The present invention also provides methods for 
analyzing a sequence of a polynucleotide or a polypeptide, said 
method comprising the step of: 

a) exposing said polynucleotide or polypeptide 
to a composition as described. 

It also provides useful methods for identifying or 
comparing a target sequence with a reference, said method 
comprising the step of: 

a) exposing said target sequence to a 
composition as described; 

b) determining the pattern of positions of the 
reagents which specifically interact with 
the target sequence; and 

c) comparing the pattern with the pattern 
exhibited by the reference when exposed to 
the composition. 
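
The comparison of steps a) through c) may be sketched as follows. This is a simplified illustration, not the disclosed method: exact substring matching stands in for specific interaction between a reagent and the target, the trinucleotide grid is an arbitrary small example, and the Jaccard score is one assumed way of comparing two patterns of positions.

```python
from itertools import product

def binding_pattern(sequence, layout):
    """Return the set of grid positions whose probe occurs as an exact substring of sequence."""
    return {pos for probe, pos in layout.items() if probe in sequence}

def pattern_similarity(a, b):
    """Jaccard similarity between two binding patterns (1.0 means identical patterns)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical 8 x 8 grid carrying all 64 trinucleotide probes.
layout = {"".join(p): divmod(i, 8) for i, p in enumerate(product("ACGT", repeat=3))}

reference = "ACGTTGCA"
target = "ACGTAGCA"   # differs from the reference at one base
ref_pattern = binding_pattern(reference, layout)
tgt_pattern = binding_pattern(target, layout)
score = pattern_similarity(ref_pattern, tgt_pattern)
```

A single base change alters every probe position that overlaps the changed base, which is why the two patterns share only part of their positions in this example.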

The present invention also provides methods for 
sequencing a segment of a polynucleotide comprising the steps 
of: 

a) combining: 

i) a substrate comprising a plurality of 
chemically synthesized and positionally 
distinguishable oligonucleotides capable of 
recognizing defined oligonucleotide 
sequences; and 

ii) a target polynucleotide; thereby forming 
high fidelity matched duplex structures of 
complementary subsequences of known 
sequence; and 

b) determining which of said reagents have 
specifically interacted with subsequences in 
said target polynucleotide. 
In one embodiment, the segment is substantially the 
entire length of said polynucleotide. 



The invention also provides methods for sequencing a 
polymer, said method comprising the steps of: 

a) preparing a plurality of reagents which each 
specifically bind to a subsequence of 
preselected length; 

b) positionally attaching each of said reagents to 
one or more solid phase substrates, thereby 
producing substrates of positionally definable 
sequence specific probes; 

c) combining said substrates with a target polymer 
whose sequence is to be determined; and 

d) determining which of said reagents have 
specifically interacted with subsequences in 
said target polymer. 

In one embodiment, the substrates are beads. 
Preferably, the plurality of reagents comprise substantially 
all possible subsequences of said preselected length found in 
said target. In another embodiment, the solid phase substrate 
is a single substrate having attached thereto reagents 
recognizing substantially all possible subsequences of 
preselected length found in said target. 

In another embodiment, the method further comprises 
the step of analyzing a plurality of said recognized 
subsequences to assemble a sequence of said target polymer. In 
a bead embodiment, at least some of the plurality of substrates 
have one subsequence specific reagent attached thereto, and the 
substrates are coded to indicate the sequence specificity of 
said reagent. 

The present invention also embraces a method of using 
a fluorescent nucleotide to detect interactions with 
oligonucleotide probes of known sequence, said method 
comprising: 

a) attaching said nucleotide to a target unknown 
polynucleotide sequence, and 

b) exposing said target polynucleotide sequence to 
a collection of positionally defined 
oligonucleotide probes of known sequences to 
determine the sequences of said probes which 
interact with said target. 
In a further refinement, an additional step is 
included of: 

a) collating said known sequences to determine the 
overlaps of said known sequences to determine 
the sequence of said target sequence. 

A method of mapping a plurality of sequences relative 
to one another is also provided, the method comprising: 

a) preparing a substrate having a plurality of 
positionally attached sequence specific probes; 

b) exposing each of said sequences to said 
substrate, thereby determining the patterns of 
interaction between said sequence specific 
probes and said sequences; and 

c) determining the relative locations of said 
sequence specific probe interactions on said 
sequences to determine the overlaps and order of 
said sequences. 
In one refinement, the sequence specific probes are 
oligonucleotides, applicable to where the target sequences are 
nucleic acid sequences. 

In the nucleic acid sequencing application, the steps 
of the sequencing process comprise: 

a) producing a matrix substrate having known 
positionally defined regions of known sequence 
specific oligonucleotide probes; 

b) hybridizing a target polynucleotide to the 
positions on the matrix so that each of the 
positions which contain oligonucleotide probes 
complementary to a sequence on the target 
hybridize to the target molecule; 

c) detecting which positions have bound the target, 
thereby determining sequences which are found on 
the target; and 

d) analyzing the known sequences contained in the 
target to determine sequence overlaps and 
assembling the sequence of the target therefrom. 

The enablement of the sequencing process by 
hybridization is based in large part upon the ability to 
synthesize a large number (e.g., to virtually saturate) of the 
possible overlapping sequence segments, to distinguish those 
probes which hybridize with fidelity from those which have 
mismatched bases, and to analyze a highly complex pattern of 
hybridization results to determine the overlap regions. 

The detecting of the positions which bind the target 
sequence would typically be through a fluorescent label on the 
target. Although a fluorescent label is probably most 
convenient, other sorts of labels, e.g., radioactive, enzyme 
linked, optically detectable, or spectroscopic labels may be 
used. Because the oligonucleotide probes are positionally 
defined, the location of the hybridized duplex will directly 
translate to the sequences which hybridize. Thus, 
analysis of the positions provides a collection of subsequences 
found within the target sequence. These subsequences are 
matched with respect to their overlaps so as to assemble an 
intact target sequence. 
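
The matching of subsequences by their overlaps can be illustrated with a minimal Python sketch. It is offered under stated assumptions rather than as the disclosed analysis: hybridization is taken to be error free, every overlap of length k-1 is assumed to occur only once in the target, and the names `spectrum` and `assemble` are hypothetical.

```python
def spectrum(target, k):
    """Set of all length-k subsequences of target, as an idealized array readout."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}

def assemble(kmers):
    """Rebuild a sequence by chaining k-mers whose (k-1)-base ends overlap.

    Raises ValueError when the start or any extension step is not unique.
    """
    kmers = set(kmers)
    suffixes = {m[1:] for m in kmers}
    # The first k-mer is the only one whose leading k-1 bases end no other k-mer.
    starts = [m for m in kmers if m[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("ambiguous or circular spectrum")
    current = starts[0]
    sequence = current
    remaining = kmers - {current}
    while remaining:
        candidates = [m for m in remaining if m[:-1] == current[1:]]
        if len(candidates) != 1:
            raise ValueError("ambiguous overlap")
        current = candidates[0]
        sequence += current[-1]   # each step contributes one new base
        remaining.remove(current)
    return sequence

rebuilt = assemble(spectrum("ATGCGTAC", 3))
```

Targets containing repeated (k-1)-base stretches violate the uniqueness assumption; the sketch raises an error in that case instead of guessing.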
In one preferred embodiment, linker molecules 
are provided on a substrate. A terminal end of the 
linker molecules is provided with a reactive functional 
group protected with a photoremovable protective group. 
Using lithographic methods, the photoremovable protective 
group is exposed to light and removed from the linker 
molecules in first selected regions. The substrate is 
then washed or otherwise contacted with a first monomer 
that reacts with exposed functional groups on the linker 
molecules. In a preferred embodiment, the monomer is an 
amino acid containing a photoremovable protective group 
at its amino or carboxy terminus and the linker molecule 
terminates in an amino or carboxy acid group bearing a 
photoremovable protective group. 

A second set of selected regions is, 
thereafter, exposed to light and the photoremovable 
protective group on the linker molecule/protected amino 
acid is removed at the second set of regions. The 
substrate is then contacted with a second monomer 
containing a photoremovable protective group for reaction 
with exposed functional groups. This process is repeated 
to selectively apply monomers until polymers of a desired 
length and desired chemical sequence are obtained. 
Photolabile groups are then optionally removed and the 
sequence is, thereafter, optionally capped. Side chain 
protective groups, if present, are also removed. 
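
The cycle of selective deprotection and coupling described above can be modeled in a few lines of Python. The sketch is illustrative only: regions are reduced to integer addresses, each mask is the set of addresses exposed to light in a cycle, coupling is assumed to be complete, and the monomers "A" and "B" and the function name `synthesize` are placeholders.

```python
def synthesize(regions, cycles):
    """Simulate light-directed synthesis.

    regions: iterable of region addresses on the substrate.
    cycles:  list of (mask, monomer) pairs; mask is the set of addresses illuminated
             before the whole substrate is contacted with the monomer.
    Returns a dict mapping each address to the sequence grown there.
    """
    chains = {r: "" for r in regions}
    for mask, monomer in cycles:
        for r in mask:
            # Only illuminated regions lose their protective group and can couple;
            # the newly added monomer carries its own protective group.
            chains[r] += monomer
    return chains

# Four cycles with half-substrate masks give all four dimers of A and B.
cycles = [
    ({0, 1}, "A"),
    ({2, 3}, "B"),
    ({0, 2}, "A"),
    ({1, 3}, "B"),
]
chains = synthesize(range(4), cycles)
```

Every region sees every reagent, yet each region ends with its own known sequence because the masks alone decide where coupling occurs.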

An improved method and apparatus for the preparation 
of polymers is disclosed. The method and 
apparatus may be applied to synthesize a variety of 
polymers at known locations on a substrate. The method 
could be used to synthesize up to about 10⁶ or more 
different sequences per cm² at known locations in some 
embodiments. 
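
As a rough check on scale, the following sketch converts a density of regions per square centimeter into the side length of one region. It assumes, for illustration only, square regions that tile the surface without gaps; the helper name `region_side_um` is hypothetical.

```python
import math

UM2_PER_CM2 = 1e8  # (1e4 microns per centimeter) squared

def region_side_um(regions_per_cm2):
    """Side length, in microns, of one square region when regions tile 1 cm² with no gaps."""
    return math.sqrt(UM2_PER_CM2 / regions_per_cm2)

side = region_side_um(1e6)   # density of about 10⁶ sequences per cm²
```

Under that assumption, about 10⁶ sequences per cm² corresponds to regions roughly 10 microns on a side, that is, about 100 square microns each.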

The method enables greater ease in peptide 
synthesis because the physical separation of reagents is 
not required when growing polymer chains. The chains 
themselves are separated by different physical locations 
on the substrate, but the entire substrate is exposed to 
the various reagents as the synthesis is conducted. 
Differential reaction is achieved by selectively exposing 
reactive functional groups to, e.g., light, electric 
currents, or another spatially localized activator. 
Remaining areas on the substrate remain unreacted. 

By using the lithographic techniques disclosed 
herein, it is possible to direct light to relatively 
small and precisely known locations on the substrate. 
It is, therefore, possible to synthesize polymers of 
a known chemical sequence at known locations on the 
substrate. 

The resulting substrate will have a variety of 
uses including, for example, screening large numbers of 
polymers for biological activity. To screen for 
biological activity, the substrate is exposed to one 
or more receptors such as antibodies, whole cells, receptors 
on vesicles, lipids, or any one of a variety of other 
receptors. The receptors are preferably labeled with, 
for example, a fluorescent marker, radioactive marker, 
or a labeled antibody reactive with the receptor. The 
location of the marker on the substrate is detected 
with, for example, photon detection or autoradiographic 
techniques. Through knowledge of the sequence of the 
material at the location where binding is detected, it is 
possible to quickly determine which sequence binds with 
the receptor and, therefore, the technique can be used to 
screen large numbers of peptides. Other possible 
applications of the inventions herein include diagnostics 
in which various antibodies for particular receptors 
would be placed on a substrate and, for example, blood 
sera would be screened for immune deficiencies. Still 
further applications include, for example, selective 
"doping" of organic materials in semiconductor devices, 
and the like. 

In connection with one aspect of the invention 
an improved reactor system for synthesizing polymers is 
also disclosed. The reactor system includes a substrate 
mount which engages a substrate around a periphery 
thereof. The substrate mount provides for a reactor 
space between the substrate and the mount through or into 
which reaction fluids are pumped or flowed. A mask is 
placed on or focused on the substrate and illuminated so 
as to deprotect selected regions of the substrate in the 
reactor space. A monomer is pumped through the reactor 
space or otherwise contacted with the substrate and 
reacts with the deprotected regions. By selectively 
deprotecting regions on the substrate and flowing 
predetermined monomers through the reactor space, desired 
polymers at known locations may be synthesized. 

Improved detection apparatus and methods are 
also disclosed. The detection method and apparatus 
utilize a substrate having a large variety of polymer 
sequences at known locations on a surface thereof. The 
substrate is exposed to a fluorescently labeled receptor 
which binds to one or more of the polymer sequences. The 
substrate is placed in a microscope detection apparatus 
for identification of locations where binding takes 
place. The microscope detection apparatus includes a 
monochromatic or polychromatic light source for directing 
light at the substrate, means for detecting fluoresced 
light from the substrate, and means for determining a 
location of the fluoresced light. The means for 
detecting light fluoresced on the substrate may in some 
embodiments include a photon counter. The means for 
determining a location of the fluoresced light may 
include an x/y translation table for the substrate. 
Translation of the slide and data collection are recorded 
and managed by an appropriately programmed digital 
computer. 

A further understanding of the nature and 
advantages of the inventions herein may be realized by 
reference to the remaining portions of the specification 
and the attached drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 illustrates a flow chart for sequence, 
fingerprint, or mapping analysis. 

Fig. 2 illustrates the proper function of a VLSIPS 
peptide synthesis. 

Fig. 3 illustrates the proper function of a VLSIPS 
dipeptide synthesis. 

Fig. 4 illustrates the process of a VLSIPS 
trinucleotide synthesis. 

Fig. 5 illustrates masking and irradiation of a 
substrate at a first location. The substrate is shown in 
cross-section; 

Fig. 6 illustrates the substrate after application 
of a monomer "A"; 

Fig. 7 illustrates irradiation of the substrate 
at a second location; 

Fig. 8 illustrates the substrate after application 
of monomer "B"; 

Fig. 9 illustrates irradiation of the "A" monomer; 

Fig. 10 illustrates the substrate after a second 
application of "B"; 

Fig. 11 illustrates a completed substrate; 

Figs. 12A and 12B illustrate alternative 
embodiments of a reactor system for forming a plurality 
of polymers on a substrate; 

Fig. 13 illustrates a detection apparatus for 
locating fluorescent markers on the substrate; 

Figs. 14A to 14M illustrate the method as it is 
applied to the production of the trimers of monomers "A" 
and "B"; 

Figs. 15A and 15B are fluorescence traces 
for standard fluorescent beads; 

Figs. 16A and 16B are fluorescence curves for 
NVOC slides not exposed and exposed to light, 
respectively; 

Figs. 17A to 17D are fluorescence plots of 
slides exposed through 100 µm, 50 µm, 20 µm, and 10 µm 
masks; 

Fig. 18 illustrates fluorescence of a slide 
with the peptide YGGFL on selected regions of its surface 
which has been exposed to labeled Herz antibody specific 
for this sequence; 

Figs. 19A to 19D illustrate formation of and a 
fluorescence plot of a slide with a checkerboard pattern 
of YGGFL and GGFL exposed to labeled Herz antibody. 
Fig. 19C illustrates a 500x500 µm mask which has been 
focused on the substrate according to Fig. 12A while 
Fig. 19D illustrates a 50x50 µm mask placed in direct 
contact with the substrate in accord with Fig. 12B; 

Fig. 20 is a fluorescence plot of YGGFL and 
PGGFL synthesized in a 50 µm checkerboard pattern; 

Fig. 21 is a fluorescence plot of YPGGFL and 
YGGFL synthesized in a 50 µm checkerboard pattern; 

Figs. 22A and 22B illustrate the mapping of 
sixteen sequences synthesized on two different glass 
slides; 

Fig. 23 is a fluorescence plot of the slide 
illustrated in Fig. 22A; and 

Fig. 24 is a fluorescence plot of the slide 
illustrated in Fig. 22B. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

I. Overall Description 
    A. general 
    B. VLSIPS substrates 
    C. binary masking 
    D. applications 
    E. detection methods and apparatus 
    F. data analysis 

II. Theoretical Analysis 
    A. simple n-mer structure; theory 
    B. complications 
    C. non-polynucleotide embodiments 

III. Polynucleotide Sequencing 
    A. preparation of substrate matrix 
    B. labeling target polynucleotide 
    C. hybridization conditions 
    D. detection; VLSIPS scanning 
    E. analysis 
    F. substrate reuse 
    G. non-polynucleotide aspects 

IV. Fingerprinting 
    A. general 
    B. preparation of substrate matrix 
    C. labeling target nucleotides 
    D. hybridization conditions 
    E. detection; VLSIPS scanning 
    F. analysis 
    G. substrate reuse 
    H. non-polynucleotide aspects 

V. Mapping 
    A. general 
    B. preparation of substrate matrix 
    C. labeling 
    D. hybridization/specific interaction 
    E. detection 
    F. analysis 
    G. substrate reuse 
    H. non-polynucleotide aspects 

VI. Additional Screening 
    A. specific interactions 
    B. sequence comparisons 
    C. categorizations 
    D. statistical correlations 

VII. Formation of Substrate 
    A. instrumentation 
    B. binary masking 
    C. synthetic methods 
    D. surface immobilization 

VIII. Hybridization/Specific Interaction 
    A. general 
    B. important parameters 

IX. Detection Methods 
    A. labeling techniques 
    B. scanning system 

X. Data Analysis 
    A. general 
    B. hardware 
    C. software 

XI. Substrate Reuse 
    A. removal of label 
    B. storage and preservation 
    C. processes to avoid degradation of oligomers 

XII. Integrated Sequencing Strategy 
    A. initial mapping strategy 
    B. selection of smaller clones 
    C. actual sequencing procedures 

XIII. Commercial Applications 
    A. sequencing 
    B. fingerprinting 
    C. mapping 

* * * 

I. OVERALL DESCRIPTION 
A. General 

The present invention relies in part on the ability 
to synthesize or attach specific recognition reagents at known 
locations on a substrate, typically a single substrate. In 
particular, the present invention provides the ability to 
prepare a substrate having a very high density matrix pattern 
of positionally defined specific recognition reagents. The 
reagents are capable of interacting with their specific targets 
while attached to the substrate, e.g., solid phase 
interactions, and by appropriate labeling of these targets, the 
sites of the interactions between the target and the specific 
reagents may be derived. Because the reagents are positionally 
defined, the sites of the interactions will define the 
specificity of each interaction. As a result, a map of the 
patterns of interactions with specific reagents on the 
substrate is convertible into information on the specific 
interactions taking place, e.g., the recognized features. 
Where the specific reagents recognize a large number of 
possible features, this system allows the determination of the 
combination of specific interactions which exist on the target 
molecule. Where the number of features is sufficiently large, 
the same combination, or pattern, of features is 
sufficiently unlikely to recur that a particular target molecule may 
often be uniquely defined by its features. In the extreme, the 
features may actually be the subunit sequence of the target 
molecule, and a given target sequence may be uniquely defined 
by its combination of features. 

In particular, the methodology is applicable to 
sequencing polynucleotides. The specific sequence recognition 
reagents will typically be oligonucleotide probes which 
hybridize with specificity to subsequences found on the target 
sequence. A sufficiently large number of those probes allows 
the fingerprinting of a target polynucleotide or the relative 
mapping of a collection of target polynucleotides, as described 
in greater detail below. 

In the high-resolution fingerprinting provided by a 
saturating collection of probes which includes all possible 
subsequences of a given size, e.g., 10-mers, all the 
subsequences are collated, the specific overlaps are 
determined, and the entire sequence can usually be reconstructed. 
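
The overlap-based reconstruction described above can be sketched as follows. This is a minimal illustration only, not the analysis procedure of the specification: the function names, the 4-mer probe size, the example target, and the assumption that the starting subsequence is known are all hypothetical, and the sketch simply stops when an overlap is ambiguous or missing.

```python
def kmer_spectrum(target: str, k: int) -> set:
    """Subsequences of length k present in the target, i.e. the probes
    of a saturating k-mer collection that would show binding."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


def reconstruct(spectrum: set, start: str) -> str:
    """Extend `start` one subunit at a time by (k-1)-subunit overlaps.

    Stops at a dead end or when more than one probe could follow,
    which is why reconstruction is 'usually', not always, possible.
    """
    k = len(start)
    sequence = start
    unused = set(spectrum) - {start}
    while True:
        suffix = sequence[-(k - 1):]
        candidates = [probe for probe in unused if probe.startswith(suffix)]
        if len(candidates) != 1:
            break
        following = candidates[0]
        sequence += following[-1]  # only the last subunit is new
        unused.remove(following)
    return sequence


# Hypothetical target, used only to generate an example probe set.
target = "ATGCGTCA"
probes = kmer_spectrum(target, 4)
assembled = reconstruct(probes, start=target[:4])
print(assembled)
```

With longer probes, such as the 10-mers mentioned above, ambiguous overlaps become less frequent, which is consistent with the statement that the entire sequence can usually be reconstructed.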

Although a polynucleotide sequence analysis is a 
preferred embodiment, for which the specific reagents are most 
easily accessible, the invention is also applicable to analysis 
of other polymers, including polypeptides, carbohydrates, and 
synthetic polymers, including α-, β-, and ω-amino acids, 
polyurethanes, polyesters, polycarbonates, polyureas, 
polyamides, polyethyleneimines, polyarylene sulfides, 
polysiloxanes, polyimides, polyacetates, and mixed polymers. 
Various optical isomers, e.g., various D- and L- forms of the 
monomers, may be used. 

Sequence analysis will take the form of complete 
sequence determination, to the level of the sequence of 
individual subunits along the entire length of the target 
sequence. Sequence analysis also takes the form of sequence 
homology, e.g., less than absolute subunit resolution, where 
"similarity" in the sequence will be detectable, or the form of 
selective sequences of homology interspersed at specific or 
irregular locations. 

In either case, the sequence is determinable at 
selective resolution or at particular locations. Thus, the 
hybridization method will be useful as a means for 
identification, e.g., a "fingerprint", much like a Southern 
hybridization method is used. It is also useful to map 
particular target sequences. 

B. VLSIPS Substrates 

The invention is enabled by the development of 
technology to prepare substrates on which specific reagents may 
be either positionally attached or synthesized. In particular, 
the very large scale immobilized polymer synthesis (VLSIPS) 
technology allows for the very high density production of an 
enormous diversity of reagents mapped out in a known matrix 
pattern on a substrate. These reagents specifically recognize 
subsequences in a target polymer and bind thereto, producing a 
map of positionally defined regions of interaction. These map 
positions are convertible into actual features recognized, and 
thus would be present in the target molecule of interest. 

As indicated, the sequence specific recognition 
reagents will often be oligonucleotides which hybridize with 
fidelity and discrimination to the target sequence. For use 
with other polymers, monoclonal or polyclonal antibodies having 
high sequence specificity will often be used. 

In the generic sense, the VLSIPS technology allows 
the production of a substrate with a high density matrix of 
positionally mapped regions with specific recognition reagents 
attached at each distinct region. By use of protective groups 
which can be positionally removed, or added, the regions can be 
activated or deactivated for addition of particular reagents or 
compounds. Details of the protection are described below and 
in related application U.S. S.N. 07/492,462 (VLSIPS CIP). In a 
preferred embodiment, photosensitive protecting agents will be 
used and the regions of activation or deactivation may be 
controlled by electro-optical and optical methods, similar to 
many of the processes used in semiconductor wafer and chip 
fabrication. 

In the nucleic acid nucleotide sequencing 
application, a VLSIPS substrate is synthesized having 
positionally defined oligonucleotide probes. See U.S. S.N. 
07/492,462 (VLSIPS CIP); and U.S. S.N. / , , attorney 
docket number 11509-28 (automated VLSIPS). By use of masking 
technology and photosensitive synthetic subunits, the VLSIPS 
apparatus allows for the stepwise synthesis of polymers 
according to a positionally defined matrix pattern. Each 
oligonucleotide probe will be synthesized at known and defined 
positional locations on the substrate. This forms a matrix 
pattern of known relationship between position and specificity 
of interaction. The VLSIPS technology allows the production of 
a very large number of different oligonucleotide probes to be 
simultaneously and automatically synthesized including numbers 
in excess of about 10², 10³, 10⁴, 10⁵, 10⁶, or even more, and 
at densities of at least about 10², 10³/cm², 10⁴/cm², 10⁵/cm² 
and up to 10⁶/cm² or more. This application discloses methods 
for synthesizing polymers on a silicon or other suitably 
derivatized substrate, methods and chemistry for synthesizing 
specific types of biological polymers on those substrates, 
apparatus for scanning and detecting whether interaction has 
occurred at specific locations on the substrate, and various 
other technologies related to the use of a high density very 
large scale immobilized polymer substrate. In particular, 
sequencing, fingerprinting, and mapping applications are 
discussed herein in detail, though related technologies are 
described in simultaneously filed applications U.S. S.N. 
/ , , attorney docket number 11509-28 (automated VLSIPS) 
and U.S. S.N. / , , attorney docket number 11509-16 
(sequencing by synthesis), each of which is hereby incorporated 
herein by reference. 
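
As a worked illustration of the probe densities discussed above, the following sketch converts a density in regions per cm² into the edge length of a square region. It assumes, purely for illustration, that the regions are square and tile the surface with no gaps; the function name and the range of exponents are hypothetical choices, not values taken from the specification.

```python
import math


def feature_edge_um(density_per_cm2: float) -> float:
    """Edge length, in micrometres, of one square region when
    `density_per_cm2` such regions exactly tile 1 cm^2."""
    edge_cm = 1.0 / math.sqrt(density_per_cm2)
    return edge_cm * 1e4  # 1 cm = 10,000 micrometres


# Illustrative densities of 10^2 through 10^6 regions per cm^2.
for exponent in range(2, 7):
    edge = feature_edge_um(10 ** exponent)
    print(f"10^{exponent} regions/cm^2 -> about {edge:.1f} um per side")
```

Under these assumptions, a density of one million regions per cm² corresponds to regions about 10 µm on a side, which is on the order of the smallest mask dimensions mentioned in the description of the drawings.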

In other embodiments, antibody probes will be 
generated which specifically recognize particular subsequences 
found on a polymer. Antibodies would be generated which are 
specific for recognizing a sequence of three contiguous amino 
acids, and monoclonal antibodies may be preferred. 
Optimally, these antibodies would not recognize any sequences 
other than the specific three amino acid stretch desired and 
the binding affinity should be insensitive to flanking or 
remote sequences found on a target molecule. Likewise, 
antibodies specific for particular carbohydrate linkages or 
sequences will be generated. A similar approach could be used 
for preparing specific reagents which recognize other polymer 
subunit sequences. These reagents would typically be site 
specifically localized to a substrate matrix pattern where the 
regions are closely packed. 

These reagents could be individually attached at 
specific sites on the substrate in a matrix by an automated 
procedure where the regions are positionally targeted by some 
other specific mechanism, e.g., one which would allow the 
entire collection of reagents to be attached to the substrate 
in a single reaction. Each reagent could be separately 
attached to a specific oligonucleotide sequence by an automated 
procedure. This would produce a collection of reagents where, 
e.g., each monoclonal antibody would have a unique 
oligonucleotide sequence attached to it. By virtue of a VLSIPS 
substrate which has different complementary oligonucleotides 
synthesized on it, each monoclonal antibody would specifically 
be bound only at that site on the substrate where the 
complementary oligonucleotide has been synthesized. A 
crosslinking step would fix the reagent to the substrate. See, 
e.g., Dattagupta et al. (1985) U.S. Patent No. 4,542,102 and 
(1987) U.S. Pat. No. 4,713,326; and Chatterjee, M. et al. 
(1990) J. Am. Chem. Soc. 112:6997- , which are hereby 
incorporated herein by reference. This allows a high density 
positionally specific collection of specific recognition 
reagents, e.g., monoclonal antibodies, to be immobilized to a 
solid substrate using an automated system. 
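
The oligonucleotide-directed placement described above can be sketched as a lookup by complementary sequence. This is a hypothetical model: the site coordinates, the 8-base sequences, the reagent names, and the function names are invented for illustration, and real hybridization depends on conditions that the sketch ignores.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(oligo: str) -> str:
    """Sequence that base-pairs with `oligo`, both written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(oligo))


def address_reagents(sites: dict, tagged_reagents: dict) -> dict:
    """Map each tagged reagent to the site whose synthesized
    oligonucleotide is complementary to the reagent's tag."""
    site_by_partner = {reverse_complement(oligo): position
                       for position, oligo in sites.items()}
    return {name: site_by_partner[tag]
            for name, tag in tagged_reagents.items()
            if tag in site_by_partner}


# Hypothetical substrate: position -> oligonucleotide synthesized there.
sites = {(0, 0): "AAGGCTTC", (0, 1): "GATTACAG", (1, 0): "CCTGAAGT"}
# Hypothetical monoclonal antibodies, each carrying a unique tag.
reagents = {"mAb-1": "CTGTAATC", "mAb-2": "GAAGCCTT", "mAb-3": "ACTTCAGG"}

placement = address_reagents(sites, reagents)
print(placement)
```

Because every tag has exactly one complementary site in this model, the whole collection of reagents can be applied in a single step and still sort to known positions.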

The regions which define particular reagents will 
usually be generated by selective protecting groups which may 
be activated or deactivated. Typically the protecting group 
will be bound to a monomer subunit or spatial region, and can 
be spatially affected by an activator, such as electromagnetic 
radiation. Examples of protective groups with utility herein 
include nitroveratryl oxycarbonyl (NVOC), nitrobenzyl 
oxycarbonyl (NBOC), dimethyl dimethoxy benzyloxy carbonyl, 
5-bromo-7-nitroindolinyl, o-hydroxy-α-methyl cinnamoyl, and 
2-oxymethylene anthraquinone. Examples of activators include ion 
beams, electric fields, magnetic fields, electron beams, x-ray, 
and other forms of electromagnetic radiation. 

The present invention provides methods and 
apparatus for the preparation and use of a substrate 
having a plurality of polymer sequences in predefined 
regions. The invention is described herein primarily 
with regard to the preparation of molecules containing 
sequences of amino acids, but could readily be applied 
in the preparation of other polymers. Such polymers 
include, for example, both linear and cyclic polymers 
of nucleic acids, polysaccharides, phospholipids, and 
peptides having either α-, β-, or ω-amino acids, 
heteropolymers in which a known drug is covalently bound to any 
of the above, polyurethanes, polyesters, polycarbonates, 
polyureas, polyamides, polyethyleneimines, polyarylene 
sulfides, polysiloxanes, polyimides, polyacetates, or 
other polymers which will be apparent upon review of this 
disclosure. In a preferred embodiment, the invention 
herein is used in the synthesis of peptides. 

The prepared substrate may, for example, be 
used in screening a variety of polymers as ligands for 
binding with a receptor, although it will be apparent 
that the invention could be used for the synthesis of 
a receptor for binding with a ligand. The substrate 
disclosed herein will have a wide variety of other uses. 
Merely by way of example, the invention herein can be 
used in determining peptide and nucleic acid sequences 
which bind to proteins, finding sequence-specific binding 
drugs, identifying epitopes recognized by antibodies, 
and evaluation of a variety of drugs for clinical and 
diagnostic applications, as well as combinations of the 
above. 

The invention preferably provides for the use 
of a substrate "S" with a surface. Linker molecules "L" 
are optionally provided on a surface of the substrate. 
The purpose of the linker molecules, in some embodiments, 
is to facilitate receptor recognition of the synthesized 
polymers. 

Optionally, the linker molecules may be 
chemically protected for storage purposes. A chemical 
storage protective group such as t-BOC (t-butoxycarbonyl) 
may be used in some embodiments. Such chemical 
protective groups would be chemically removed upon 
exposure to, for example, acidic solution and would 
serve to protect the surface during storage and be 
removed prior to polymer preparation. 

On the substrate or a distal end of the linker 
molecules, a functional group with a protective group P₀ 
is provided. The protective group P₀ may be removed upon 
exposure to radiation, electric fields, electric 
currents, or other activators to expose the functional 
group. 

In a preferred embodiment, the radiation is 
ultraviolet (UV), infrared (IR), or visible light. As 
more fully described below, the protective group may 
alternatively be an electrochemically-sensitive group 
which may be removed in the presence of an electric 
field. In still further alternative embodiments, ion 
beams, electron beams, or the like may be used for 
deprotection. 

In some embodiments, the exposed regions and, 
therefore, the area upon which each distinct polymer 
sequence is synthesized are smaller than about 1 cm² or 
less than 1 mm². In preferred embodiments the exposed 
area is less than about 10,000 µm² or, more preferably, 
less than 100 µm² and may, in some embodiments, encompass 
the binding site for as few as a single molecule. Within 
these regions, each polymer is preferably synthesized in 
a substantially pure form. 

Concurrently with or after exposure of a known 
region of the substrate to light, the surface is 
contacted with a first monomer unit M₁ which reacts 
with the functional group which has been exposed by 
the deprotection step. The first monomer includes a 
protective group P₁. P₁ may or may not be the same as P₀. 

Accordingly, after a first cycle, known first 
regions of the surface may comprise the sequence: 

S-L-M₁-P₁ 

while remaining regions of the surface comprise the 
sequence: 

S-L-P₀. 

Thereafter, second regions of the surface (which may 
include the first region) are exposed to light and 
contacted with a second monomer M₂ (which may or may not be 
the same as M₁) having a protective group P₂. P₂ may or 
may not be the same as P₀ and P₁. After this second 
cycle, different regions of the substrate may comprise 
one or more of the following sequences: 

S-L-M₁-M₂-P₂ 
S-L-M₂-P₂ 
S-L-M₁-P₁ and/or 
S-L-P₀. 

The above process is repeated until the substrate 
includes desired polymers of desired lengths. By 
controlling the locations of the substrate exposed 
to light and the reagents exposed to the substrate 
following exposure, the location of each sequence will 
be known. 
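
The two cycles described above can be followed in a short simulation. This is a simplified sketch under stated assumptions: the `Region` class, the region names r1 to r4, and the single generic protective-group label "P" are hypothetical, and the model treats coupling as complete wherever a region has been deprotected.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One predefined region of the substrate surface (hypothetical model)."""
    monomers: list = field(default_factory=list)
    protected: bool = True  # terminal functional group carries a protective group


def run_cycle(substrate: dict, exposed: set, monomer: str) -> None:
    """One cycle: light deprotects the exposed regions, then the monomer
    contacts the whole surface but reacts only where deprotected."""
    for name in exposed:
        substrate[name].protected = False
    for region in substrate.values():
        if not region.protected:
            region.monomers.append(monomer)
            region.protected = True  # the new monomer brings its own protective group


def describe(region: Region) -> str:
    """Write the region in the S-L-...-P notation used in the text."""
    return "-".join(["S", "L", *region.monomers, "P"])


substrate = {name: Region() for name in ("r1", "r2", "r3", "r4")}
run_cycle(substrate, exposed={"r1", "r2"}, monomer="M1")  # first cycle
run_cycle(substrate, exposed={"r1", "r3"}, monomer="M2")  # second cycle
sequences = {name: describe(region) for name, region in substrate.items()}
print(sequences)
```

The four regions end with the four outcomes listed above: both monomers, only the second, only the first, or neither. Because the exposure pattern of every cycle is recorded, the sequence at each location is known without any measurement.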

Thereafter, the protective groups are removed 
from some or all of the substrate and the sequences are, 
optionally, capped with a capping unit C. The process 
results in a substrate having a surface with a plurality 
of polymers of the following general formula: 

S-[L]-(Mᵢ)-(Mⱼ)-(Mₖ) ... (Mₓ)-[C] 

where square brackets indicate optional groups, and 
Mᵢ...Mₓ indicates any sequence of monomers. The number of 
monomers could cover a wide variety of values, but in a 
preferred embodiment they will range from 2 to 100. 

In some embodiments, polymers at a plurality of 
locations on the substrate are to contain a common monomer 
subsequence. For example, it may be desired to synthesize 
a sequence S-M₁-M₂-M₃ at first locations and a 
sequence S-M₄-M₂-M₃ at second locations. The process 
would commence with irradiation of the first locations 
followed by contacting with M₁-P, resulting in the 
sequence S-M₁-P at the first locations. The second 
locations would then be irradiated and contacted with M₄-P, 
resulting in the sequence S-M₄-P at the second locations. 
Thereafter both the first and second locations would be 
irradiated and contacted with the dimer M₂-M₃, resulting 
in the sequence S-M₁-M₂-M₃ at the first locations and 
S-M₄-M₂-M₃ at the second locations. Of course, common 
subsequences of any length could be utilized including 
those in a range of 2 or more monomers, 2 to 100 
monomers, 2 to 20 monomers, and a most preferred range 
of 2 to 3 monomers. 

According to other embodiments, a set of masks 
is used for the first monomer layer and, thereafter, 
varied light wavelengths are used for selective 
deprotection. For example, in the process discussed 
above, first regions are first exposed through a mask and 
reacted with a first monomer having a first protective 
group P₁, which is removable upon exposure to a first 
wavelength of light (e.g., IR). Second regions are 
masked and reacted with a second monomer having a second 
protective group P₂, which is removable upon exposure to a 
second wavelength of light (e.g., UV). Thereafter, masks 
become unnecessary in the synthesis because the entire 
substrate may be exposed alternately to the first and 
second wavelengths of light in the deprotection cycle. 
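
The mask-free cycles described above can be sketched as follows. The sketch assumes, as a labeled simplification, that each monomer added after the first layer carries a protective group removable at the same wavelength as the group it replaces, so that a region keeps its wavelength identity. The dictionary layout, the monomer labels, and the function name are hypothetical.

```python
def flood_expose(substrate: dict, wavelength: str, monomer: str) -> None:
    """Expose the entire substrate (no mask) to one wavelength and couple
    `monomer` only where that wavelength removes the protective group."""
    for region in substrate.values():
        if region["removable_by"] == wavelength:
            # Assumption: the incoming monomer carries a protective group
            # removable at the same wavelength as the one just removed.
            region["chain"].append(monomer)


# State after the masked first layer: P1 (IR-removable) on the first
# regions, P2 (UV-removable) on the second regions.
substrate = {
    "first": {"chain": ["M1"], "removable_by": "IR"},
    "second": {"chain": ["M2"], "removable_by": "UV"},
}

flood_expose(substrate, "IR", "A")  # reaches only the first regions
flood_expose(substrate, "UV", "B")  # reaches only the second regions
flood_expose(substrate, "IR", "C")
chains = {name: "-".join(region["chain"]) for name, region in substrate.items()}
print(chains)
```

In this model the wavelength takes over the selecting role of the mask, which is why only the first monomer layer needs masked exposure.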

The polymers prepared on a substrate according 
to the above methods will have a variety of uses including, 
for example, screening for biological activity. In 
such screening activities, the substrate containing the 
sequences is exposed to an unlabeled or labeled receptor 
such as an antibody, receptor on a cell, phospholipid 
vesicle, or any one of a variety of other receptors. In 
one preferred embodiment the polymers are exposed to a 
first, unlabeled receptor of interest and, thereafter, 
exposed to a labeled receptor-specific recognition 
element, which is, for example, an antibody. This 
process will provide signal amplification in the 
detection stage. 

The receptor molecules may bind with one or 
more polymers on the substrate. The presence of the 
labeled receptor and, therefore, the presence of a 
sequence which binds with the receptor is detected in a 
preferred embodiment through the use of autoradiography, 
detection of fluorescence with a charge-coupled device, 
fluorescence microscopy, or the like. The sequence of 
the polymer at the locations where the receptor binding 
is detected may be used to determine all or part of a 
sequence which is complementary to the receptor. 
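
Because the sequence at every location is known from the synthesis, decoding a detection result reduces to a lookup. The following sketch illustrates this; the layout of trimers, the signal values, and the threshold are invented for illustration and do not represent measured data.

```python
def sequences_at_hits(layout: dict, counts: dict, threshold: float) -> dict:
    """Return the known sequence at each location whose detected
    signal meets or exceeds the threshold."""
    return {position: layout[position]
            for position, value in counts.items()
            if value >= threshold}


# Hypothetical layout: location -> sequence synthesized there.
layout = {(0, 0): "AAA", (0, 1): "AAB", (1, 0): "ABA", (1, 1): "ABB"}
# Made-up signal values standing in for photon counts per location.
counts = {(0, 0): 12, (0, 1): 840, (1, 0): 9, (1, 1): 615}

hits = sequences_at_hits(layout, counts, threshold=100)
print(hits)
```

Comparing the sequences returned for the bound locations is one way to see which subunits the receptor tolerates or requires, in line with the statement above that binding locations may be used to determine all or part of a complementary sequence.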

Use of the invention herein is illustrated 
primarily with reference to screening for biological 
activity. The invention will, however, find many other 
uses. For example, the invention may be used in 
information storage (e.g., on optical disks), production 
of molecular electronic devices, production of stationary 
phases in separation sciences, production of dyes and 
brightening agents, photography, and in immobilization of 
cells, proteins, lectins, nucleic acids, polysaccharides 
and the like in patterns on a surface via molecular 
recognition of specific polymer sequences. By 
synthesizing the same compound in adjacent, progressively 
differing concentrations, a gradient will be established 
to control chemotaxis or to develop diagnostic dipsticks 
which, for example, titrate an antibody against an 
increasing amount of antigen. By synthesizing several 
catalyst molecules in close proximity, more efficient 
multistep conversions may be achieved by "coordinate 
immobilization." Coordinate immobilization also may be 
used for electron transfer systems, as well as to provide 
both structural integrity and other desirable properties 
to materials such as lubrication, wetting, etc. 

According to alternative embodiments, molecular 
biodistribution or pharmacokinetic properties may be 
examined. For example, to assess resistance to 
intestinal or serum proteases, polymers may be capped 
with a fluorescent tag and exposed to biological fluids 
of interest. 

III. Polymer Synthesis 

Fig. 1 illustrates one embodiment of the 
invention disclosed herein in which a substrate 2 is 
shown in cross-section. Essentially, any conceivable 
substrate may be employed in the invention. The 
substrate may be biological, nonbiological, organic, 
inorganic, or a combination of any of these, existing as 
particles, strands, precipitates, gels, sheets, tubing, 
spheres, containers, capillaries, pads, slices, films, 
plates, slides, etc. The substrate may have any 
convenient shape, such as a disc, square, sphere, circle, 
etc. The substrate is preferably flat but may take on a 
variety of alternative surface configurations. For 
example, the substrate may contain raised or depressed 
regions on which the synthesis takes place. The 
substrate and its surface preferably form a rigid support 
on which to carry out the reactions described herein. 
The substrate and its surface are also chosen to provide 
appropriate light-absorbing characteristics. For 
instance, the substrate may be a polymerized Langmuir 
Blodgett film, functionalized glass, Si, Ge, GaAs, GaP, 
SiO₂, SiN₄, modified silicon, or any one of a wide 
variety of gels or polymers such as (poly)tetrafluoroethylene, 
(poly)vinylidenedifluoride, polystyrene, 
polycarbonate, or combinations thereof. Other substrate 
materials will be readily apparent to those of skill in 
the art upon review of this disclosure. In a preferred 
embodiment the substrate is flat glass or single-crystal 
silicon with surface relief features of less than 10 Å. 

According to some embodiments, the surface of 
the substrate is etched using well known techniques to 
provide for desired surface features. For example, by 
way of the formation of trenches, v-grooves, mesa 
structures, or the like, the synthesis regions may be 
more closely placed within the focus point of impinging 
light, be provided with reflective "mirror" structures 
for maximization of light collection from fluorescent 
sources, or the like. 

Surfaces on the solid substrate will usually, 
though not always, be composed of the same material as 
the substrate. Thus, the surface may be composed of any 
of a wide variety of materials, for example, polymers, 
plastics, resins, polysaccharides, silica or silica-based 
materials, carbon, metals, inorganic glasses, membranes, 
or any of the above-listed substrate materials. In some 
embodiments the surface may provide for the use of caged 
binding members which are attached firmly to the surface 
of the substrate in accord with the teaching of copending 
application Serial No. 404,920, previously incorporated 
herein by reference. Preferably, the surface will 
contain reactive groups, which could be carboxyl, amino, 
hydroxyl, or the like. Most preferably, the surface will 
be optically transparent and will have surface Si-OH 
functionalities, such as are found on silica surfaces. 

The surface 4 of the substrate is preferably 
provided with a layer of linker molecules 6, although it 
will be understood that the linker molecules are not required 
elements of the invention. The linker molecules 
are preferably of sufficient length to permit polymers in 
a completed substrate to interact freely with molecules 
exposed to the substrate. The linker molecules should be 
6-50 atoms long to provide sufficient exposure. The 
linker molecules may be, for example, aryl acetylene, 
ethylene glycol oligomers containing 2-10 monomer units, 
diamines, diacids, amino acids, or combinations thereof. 
Other linker molecules may be used in light of this 
disclosure. 

According to alternative embodiments, the 
linker molecules are selected based upon their 
hydrophilic/hydrophobic properties to improve 
presentation of synthesized polymers to certain 
receptors. For example, in the case of a hydrophilic 
receptor, hydrophilic linker molecules will be preferred 
so as to permit the receptor to more closely approach the 
synthesized polymer. 

According to another alternative embodiment, 
linker molecules are also provided with a photocleavable 
group at an intermediate position. The photocleavable 
group is preferably cleavable at a wavelength different 
from the protective group. This enables removal of the 
various polymers following completion of the synthesis by 
way of exposure to the different wavelengths of light. 

The linker molecules can be attached to the 
substrate via carbon-carbon bonds using, for example, 
(poly)trifluorochloroethylene surfaces, or preferably, 
by siloxane bonds (using, for example, glass or silicon 
oxide surfaces). Siloxane bonds with the surface of the 
substrate may be formed in one embodiment via reactions 
of linker molecules bearing trichlorosilyl groups. The 
linker molecules may optionally be attached in an ordered 
array, i.e., as parts of the head groups in a polymerized 
Langmuir Blodgett film. In alternative embodiments, the 
linker molecules are adsorbed to the surface of the 
substrate. 

The linker molecules and monomers used herein 
are provided with a functional group to which is bound a 
protective group. Preferably, the protective group is 
on the distal or terminal end of the linker molecule 
opposite the substrate. The protective group may be 
either a negative protective group (i.e., the protective 
group renders the linker molecules less reactive with a 
monomer upon exposure) or a positive protective group 
(i.e., the protective group renders the linker molecules 
more reactive with a monomer upon exposure). In the case
of negative protective groups an additional step of 
reactivation will be required. In some embodiments, 
this will be done by heating. 

The protective group on the linker molecules
may be selected from a wide variety of positive
light-reactive groups preferably including nitro aromatic
compounds such as o-nitrobenzyl derivatives or
benzylsulfonyl. In a preferred embodiment,
6-nitroveratryloxycarbonyl (NVOC), 2-nitrobenzyloxycarbonyl
(NBOC) or α,α-dimethyl-dimethoxybenzyloxycarbonyl (DDZ) is
used. In one embodiment, a nitro aromatic compound containing
a benzylic hydrogen ortho to the nitro group is used,
i.e., a chemical of the form:

[structural formula not reproduced in this text]

where R1 is alkoxy, alkyl, halo, aryl, alkenyl, or
hydrogen; R2 is alkoxy, alkyl, halo, aryl, nitro, or
hydrogen; R3 is alkoxy, alkyl, halo, nitro, aryl, or
hydrogen; R4 is alkoxy, alkyl, hydrogen, aryl, halo, or
nitro; and R5 is alkyl, alkynyl, cyano, alkoxy, hydrogen,
halo, aryl, or alkenyl. Other materials which may be
used include o-hydroxy-α-methyl cinnamoyl derivatives.
Photoremovable protective groups are described in, for
example, Patchornik, J. Am. Chem. Soc. (1970) 92:6333 and
Amit et al., J. Org. Chem. (1974) 39:192, both of which
are incorporated herein by reference.

In an alternative embodiment the positive 
reactive group is activated for reaction with reagents 
in solution. For example, a 5-bromo-7-nitroindoline
group, when bound to a carbonyl, undergoes reaction upon
exposure to light at 420 nm.

In a second alternative embodiment, the 
reactive group on the linker molecule is selected from a 
wide variety of negative light-reactive groups including 
a cinnamate group.

Alternatively, the reactive group is activated 
or deactivated by electron beam lithography, x-ray 
lithography, or any other radiation. Suitable reactive 
groups for electron beam lithography include sulfonyl. 
Other methods may be used including, for example, expo- 
sure to a current source. Other reactive groups and 
methods of activation may be used in light of this 
disclosure.

As shown in Fig. 5, the linking molecules
are preferably exposed to, for example, light through a
suitable mask 8 using photolithographic techniques of
the type known in the semiconductor industry
and described in, for example, Sze, VLSI Technology,
McGraw-Hill (1983), and Mead et al., Introduction to VLSI
Systems, Addison-Wesley (1980), which are incorporated
herein by reference for all purposes. The light may be
directed at either the surface containing the protective
groups or at the back of the substrate, so long as the
substrate is transparent to the wavelength of light
needed for removal of the protective groups. In the
embodiment shown in Fig. 5, light is directed at the
surface of the substrate containing the protective
groups. Fig. 5 illustrates the use of such masking
techniques as they are applied to a positive reactive
group so as to activate linking molecules and expose
functional groups in areas 10a and 10b.

The mask 8 is in one embodiment a transparent 
support material selectively coated with a layer of 
opaque material. Portions of the opaque material are 
removed, leaving opaque material in the precise pattern 
desired on the substrate surface. The mask is brought
into close proximity with, imaged on, or brought directly
into contact with the substrate surface as shown in
Fig. 5. "Openings" in the mask correspond to locations
on the substrate where it is desired to remove
photoremovable protective groups from the substrate.
Alignment may be performed using conventional alignment
techniques in which alignment marks (not shown) are used
to accurately overlay successive masks with previous
patterning steps, or more sophisticated techniques may
be used. For example, interferometric techniques
such as the one described in Flanders et al., "A New
Interferometric Alignment Technique," Appl. Phys. Lett.
(1977) 31:426-428, which is incorporated herein by
reference, may be used.

To enhance contrast of light applied to
the substrate, it is desirable to provide contrast
enhancement materials between the mask and the substrate
according to some embodiments. This contrast enhancement
layer may comprise a molecule which is decomposed by
light such as quinone diazide or a material which is
transiently bleached at the wavelength of interest.
Transient bleaching of materials will allow greater
penetration where light is applied, thereby enhancing
contrast. Alternatively, contrast enhancement may be
provided by way of a cladded fiber optic bundle.

The light may be from a conventional 
incandescent source, a laser, a laser diode, or the like. 
If non-collimated sources of light are used it may be
desirable to provide a thick- or multi-layered mask to
prevent spreading of the light onto the substrate. It
may, further, be desirable in some embodiments to utilize
groups which are sensitive to different wavelengths to
control synthesis. For example, by using groups which
are sensitive to different wavelengths, it is possible to 
select branch positions in the synthesis of a polymer or 
eliminate certain masking steps. Several reactive groups 
along with their corresponding wavelengths for 
deprotection are provided in Table 1. 
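
As a rough illustration of wavelength-selective deprotection, the approximate ranges listed in Table 1 can be placed in a lookup and queried for a given exposure wavelength. This is only a sketch: the abbreviated keys and the function name groups_removed_at are illustrative assumptions, single-wavelength entries are stored as a one-point range, and the ranges themselves are approximate.

```python
# Approximate deprotection ranges in nm, transcribed from Table 1.
# Keys are shortened labels chosen for this sketch (an assumption).
DEPROTECTION_RANGES_NM = {
    "NVOC": (300, 400),
    "NBOC": (300, 350),
    "DDZ": (280, 300),
    "5-bromo-7-nitroindolinyl": (420, 420),
    "o-hydroxy-alpha-methyl cinnamoyl": (300, 350),
    "2-oxymethylene anthraquinone": (350, 350),
}


def groups_removed_at(wavelength_nm):
    """Return, sorted by name, the groups whose listed range includes wavelength_nm."""
    return sorted(
        name
        for name, (low, high) in DEPROTECTION_RANGES_NM.items()
        if low <= wavelength_nm <= high
    )


if __name__ == "__main__":
    for wavelength in (290, 320, 365, 420):
        print(wavelength, groups_removed_at(wavelength))
```

A wavelength that returns a single group is a candidate for removing that group while leaving the others in place, which is the basis for selecting branch positions or eliminating masking steps.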



Table 1

Group                                   Approximate Deprotection Wavelength

Nitroveratryloxycarbonyl (NVOC)         UV (300-400 nm)
Nitrobenzyloxycarbonyl (NBOC)           UV (300-350 nm)
Dimethyl dimethoxybenzyloxycarbonyl     UV (280-300 nm)
5-Bromo-7-nitroindolinyl                UV (420 nm)
o-Hydroxy-α-methyl cinnamoyl            UV (300-350 nm)
2-Oxymethylene anthraquinone            UV (350 nm)

While the invention is illustrated primarily
herein by way of the use of a mask to illuminate selected
regions of the substrate, other techniques may also be used.
For example, the substrate may be translated under a
modulated laser or diode light source. Such techniques
are discussed in, for example, U.S. Patent No. 4,719,615
(Feyrer et al.), which is incorporated herein by
reference. In alternative embodiments a laser
galvanometric scanner is utilized. In other embodiments,
the synthesis may take place on or in contact with a
conventional liquid crystal (referred to herein as a
"light valve") or fiber optic light sources. By
appropriately modulating liquid crystals, light may be
selectively controlled so as to permit light to contact
selected regions of the substrate. Alternatively,
synthesis may take place on the end of a series of
optical fibers to which light is selectively applied.
Other means of controlling the location of light exposure
will be apparent to those of skill in the art.

The substrate may be irradiated either in 
contact or not in contact with a solution (not shown) 
and is, preferably, irradiated in contact with a 
solution. The solution contains reagents to prevent the 
by-products formed by irradiation from interfering with 
synthesis of the polymer according to some embodiments.
Such by-products might include, for example, carbon
dioxide, nitrosocarbonyl compounds, styrene derivatives,
indole derivatives, and products of their photochemical
reactions. Alternatively, the solution may contain
reagents used to match the index of refraction of the
substrate. Reagents added to the solution may further
include, for example, acidic or basic buffers, thiols,
substituted hydrazines and hydroxylamines, reducing
agents (e.g., NADH) or reagents known to react with a
given functional group (e.g., aryl nitroso + glyoxylic
acid → aryl formhydroxamate + CO₂).

Either concurrently with or after the
irradiation step, the linker molecules are washed or
otherwise contacted with a first monomer, illustrated
by "A" in regions 12a and 12b in Fig. 6. The first
monomer reacts with the activated functional groups of
the linkage molecules which have been exposed to light.
The first monomer, which is preferably an amino acid,
is also provided with a photoprotective group. The
photoprotective group on the monomer may be the same
as or different than the protective group used in the
linkage molecules, and may be selected from any of the
above-described protective groups. In one embodiment,
the protective group for the A monomer is selected from
the group NBOC and NVOC.

As shown in Fig. 7, the process of irradiating
is thereafter repeated, with a mask repositioned so as to
remove linkage protective groups and expose functional
groups in regions 14a and 14b which are illustrated as
being regions which were protected in the previous
masking step. As an alternative to repositioning of
the first mask, in many embodiments a second mask will
be utilized. In other alternative embodiments, some
steps may provide for illuminating a common region in
successive steps. As shown in Fig. 7, it may be
desirable to provide separation between irradiated
regions. For example, separation of about 1-5 μm may
be appropriate to account for alignment tolerances.

As shown in Fig. 8, the substrate is then
exposed to a second protected monomer "B," producing
B regions 16a and 16b. Thereafter, the substrate is
again masked so as to remove the protective groups and
expose reactive groups on A region 12a and B region 16b.
The substrate is again exposed to monomer B, resulting in
the formation of the structure shown in Fig. 10. The
dimers B-A and B-B have been produced on the substrate.

A subsequent series of masking and contacting
steps similar to those described above with A (not shown)
provides the structure shown in Fig. 11. The process
provides all possible dimers of B and A, i.e., B-A, A-B,
A-A, and B-B.
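
The sequence of masking and coupling steps described above can be sketched as a small simulation. This is a simplified model, not the disclosed apparatus: the region labels r1 to r4 and the function run_synthesis are hypothetical, and the model assumes that every illuminated region couples the incoming monomer completely while unilluminated regions remain protected.

```python
def run_synthesis(regions, steps):
    """Apply (illuminated_regions, monomer) steps and return the chain in each region.

    Chains are written with the most recently coupled monomer first,
    so "B-A" means B was coupled onto A.
    """
    chains = {region: [] for region in regions}
    for illuminated, monomer in steps:
        for region in illuminated:
            # Only illuminated (deprotected) regions couple the monomer.
            chains[region].insert(0, monomer)
    return {region: "-".join(chain) for region, chain in chains.items()}


regions = ["r1", "r2", "r3", "r4"]  # hypothetical region labels
steps = [
    ({"r1", "r2"}, "A"),  # first mask, then protected monomer A
    ({"r3", "r4"}, "B"),  # mask repositioned, then protected monomer B
    ({"r1", "r4"}, "B"),  # one A region and one B region receive B
    ({"r2", "r3"}, "A"),  # the remaining regions receive A
]
dimers = run_synthesis(regions, steps)

if __name__ == "__main__":
    print(dimers)
```

Four exposure and coupling cycles yield the four possible dimers, one per region.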

The substrate, the area of synthesis, and the
area for synthesis of each individual polymer could be of
any size or shape. For example, squares, ellipsoids,
rectangles, triangles, circles, or portions thereof,
along with irregular geometric shapes, may be utilized.
Duplicate synthesis areas may also be applied to a single
substrate for purposes of redundancy.

In one embodiment the regions 12 and 16 on the
substrate will have a surface area of between about 1 cm²
and 10⁻¹⁰ cm². In some embodiments the regions 12 and 16
have areas of less than about 10⁻¹ cm², 10⁻² cm², 10⁻³ cm²,
10⁻⁴ cm², 10⁻⁵ cm², 10⁻⁶ cm², 10⁻⁷ cm², 10⁻⁸ cm², or 10⁻¹⁰ cm².
In a preferred embodiment, the regions 12 and 16 are
between about 10x10 μm and 500x500 μm.
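
The relationship between region size and the number of regions that fit on a substrate can be checked with a short calculation. The sketch below assumes square regions packed edge to edge with no separation between them, which is an idealization; the function names are illustrative.

```python
UM_PER_CM = 10_000  # 1 cm = 10,000 micrometers


def region_area_cm2(side_um):
    """Area in cm^2 of a square region whose side is given in micrometers."""
    side_cm = side_um / UM_PER_CM
    return side_cm * side_cm


def regions_per_cm2(side_um):
    """Number of whole square regions of the given side that tile 1 cm^2."""
    per_side = UM_PER_CM // side_um
    return per_side * per_side


if __name__ == "__main__":
    for side in (500, 100, 10):
        print(side, "um:", region_area_cm2(side), "cm^2,",
              regions_per_cm2(side), "regions per cm^2")
```

Under this idealization, 500x500 μm regions correspond to 400 regions per cm² and 10x10 μm regions to 10⁶ regions per cm².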

In some embodiments a single substrate supports
more than about 10 different monomer sequences and
preferably more than about 100 different monomer
sequences, although in some embodiments more than about
10³, 10⁴, 10⁵, 10⁶, 10⁷, or 10⁸ different sequences are
provided on a substrate. Of course, within a region
of the substrate in which a monomer sequence is
synthesized, it is preferred that the monomer sequence
be substantially pure. In some embodiments, regions of
the substrate contain polymer sequences which are at
least about 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%,
45%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, or
99% pure.

According to some embodiments, several 
sequences are intentionally provided within a single 
region so as to provide an initial screening for 
biological activity, after which materials within regions 
exhibiting significant binding are further evaluated.

IV. Details of One Embodiment of a Reactor System

Fig. 12A schematically illustrates a preferred
embodiment of a reactor system 100 for synthesizing
polymers on the prepared substrate in accordance with one
aspect of the invention. The reactor system includes a
body 102 with a cavity 104 on a surface thereof. In
preferred embodiments the cavity 104 is between about 50
and 1000 μm deep with a depth of about 500 μm preferred.

The bottom of the cavity is preferably provided
with an array of ridges 106 which extend both into the
plane of the Figure and parallel to the plane of the
Figure. The ridges are preferably about 50 to 200 μm
deep and spaced at about 2 to 3 mm. The purpose of the
ridges is to generate turbulent flow for better mixing.
The bottom surface of the cavity is preferably light
absorbing so as to prevent reflection of impinging light.

A substrate 112 is mounted above the cavity
104. The substrate is provided along its bottom surface
114 with a photoremovable protective group such as NVOC
with or without an intervening linker molecule. The
substrate is preferably transparent to a wide spectrum of
light, but in some embodiments is transparent only at a
wavelength at which the protective group may be removed
(such as UV in the case of NVOC). The substrate in some
embodiments is a conventional microscope glass slide or
cover slip. The substrate is preferably as thin as
possible, while still providing adequate physical
support. Preferably, the substrate is less than about
1 mm thick, more preferably less than 0.5 mm thick, more
preferably less than 0.1 mm thick, and most preferably
less than 0.05 mm thick. In alternative preferred
embodiments, the substrate is quartz or silicon.

The substrate and the body serve to seal the
cavity except for an inlet port 108 and an outlet port
110. The body and the substrate may be mated for sealing
in some embodiments with one or more gaskets. According
to a preferred embodiment, the body is provided with two
concentric gaskets and the intervening space is held at
vacuum to ensure mating of the substrate to the gaskets.

Fluid is pumped through the inlet port into the
cavity by way of a pump 116 which may be, for example, a
model no. B-120-S made by Eldex Laboratories. Selected
fluids are circulated into the cavity by the pump,
through the cavity, and out the outlet for recirculation
or disposal. The reactor may be subjected to ultrasonic
radiation and/or heated to aid in agitation in some
embodiments.

Above the substrate 112, a lens 120 is provided
which may be, for example, a 2" 100 mm focal length fused
silica lens. For the sake of a compact system, a
reflective mirror 122 may be provided for directing
light from a light source 124 onto the substrate. Light
source 124 may be, for example, a Xe(Hg) light source
manufactured by Oriel and having model no. 66024. A
second lens 126 may be provided for the purpose of
projecting a mask image onto the substrate in combination
with lens 120. This form of lithography is referred to
herein as projection printing. As will be apparent from
this disclosure, proximity printing and the like may also
be used according to some embodiments.

Light from the light source is permitted to 
reach only selected locations on the substrate as a 
result of mask 128. Mask 128 may be, for example, a 
glass slide having etched chrome thereon. The mask 128 
in one embodiment is provided with a grid of transparent 
locations and opaque locations. Such masks may be 
manufactured by, for example, Photo Sciences, Inc. Light
passes freely through the transparent regions of the 
mask, but is reflected from or absorbed by other regions. 
Therefore, only selected regions of the substrate are 
exposed to light. 

As discussed above, light valves (LCD's) 
may be used as an alternative to conventional masks 
to selectively expose regions of the substrate.
Alternatively, fiber optic faceplates such as those
available from Schott Glass, Inc. may be used for the
purpose of contrast enhancement of the mask or as the
sole means of restricting the region to which light is
applied. Such faceplates would be placed directly above
or on the substrate in the reactor shown in Fig. 8A. In
still further embodiments, fly's-eye lenses, tapered fiber
optic faceplates, or the like, may be used for contrast
enhancement.

In order to provide for illumination of regions
smaller than a wavelength of light, more elaborate
techniques may be utilized. For example, according to
one preferred embodiment, light is directed at the
substrate by way of molecular microcrystals on the tip
of, for example, micropipettes. Such devices are
disclosed in Lieberman et al., "A Light Source Smaller
Than the Optical Wavelength," Science (1990) 247:59-61,
which is incorporated herein by reference for all
purposes.

In operation, the substrate is placed on the 
cavity and sealed thereto. All operations in the process 
of preparing the substrate are carried out in a room lit 
primarily or entirely by light of a wavelength outside of 
the light range at which the protective group is removed. 
For example, in the case of NVOC, the room should be lit 
with a conventional dark room light which provides little 
or no UV light. All operations are preferably conducted 
at about room temperature. 

A first, deprotection fluid (without a monomer)
is circulated through the cavity. The solution is
preferably 5 mM sulfuric acid in dioxane,
which serves to keep exposed amino groups protonated and
decreases their reactivity with photolysis by-products.
Absorptive materials such as N,N-diethylamino
2,4-dinitrobenzene, for example, may be included in the
deprotection fluid to absorb light and
prevent reflection and unwanted photolysis.

The slide is, thereafter, positioned in a light
raypath from the mask such that first locations on the
substrate are illuminated and, therefore, deprotected.
In preferred embodiments the substrate is illuminated
for between about 1 and 15 minutes with a preferred
illumination time of about 10 minutes at 10-20 mW/cm² with
365 nm light. The slides are neutralized (i.e., brought
to a pH of about 7) after photolysis with, for example, a
solution of di-isopropylethylamine (DIEA) in methylene
chloride for about 5 minutes.
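
The exposure conditions above can be expressed as an energy dose per unit area, which is simply irradiance multiplied by time. The following sketch performs that conversion; the function name is an illustrative assumption, and the calculation ignores any absorption by the substrate or the deprotection fluid.

```python
def exposure_dose_j_per_cm2(irradiance_mw_per_cm2, minutes):
    """Energy per unit area in J/cm^2 for a given irradiance (mW/cm^2) and time (min)."""
    watts_per_cm2 = irradiance_mw_per_cm2 / 1000.0
    seconds = minutes * 60.0
    return watts_per_cm2 * seconds


if __name__ == "__main__":
    low = exposure_dose_j_per_cm2(10, 10)
    high = exposure_dose_j_per_cm2(20, 10)
    print(f"10 minutes at 10-20 mW/cm^2: {low:.1f} to {high:.1f} J/cm^2")
```

For the preferred 10 minute exposure at 10-20 mW/cm², the dose is roughly 6 to 12 J/cm².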

The first monomer is then placed at the first 
locations on the substrate. After irradiation, the slide 
is removed, treated in bulk, and then reinstalled in the 
flow cell. Alternatively, a fluid containing the first 
monomer, preferably also protected by a protective group, 
is circulated through the cavity by way of pump 116. If, 
for example, it is desired to attach the amino acid Y to 
the substrate at the first locations, the amino acid Y 
(bearing a protective group on its α-nitrogen), along
with reagents used to render the monomer reactive, and/or
a carrier, is circulated from a storage container 118, 
through the pump, through the cavity, and back to the 
inlet of the pump. 

The monomer carrier solution is, in a preferred 
embodiment, formed by mixing of a first solution 
(referred to herein as solution "A") and a second 
solution (referred to herein as solution "B"). Table 2
provides an illustration of a mixture which may be used
for solution A.

Table 2

Representative Monomer Carrier Solution "A"

100 mg    NVOC amino protected amino acid
37 mg     HOBT (1-Hydroxybenzotriazole)
250 μl    DMF (Dimethylformamide)
86 μl     DIEA (Diisopropylethylamine)



The composition of solution B is illustrated in
Table 3. Solutions A and B are mixed and allowed to
react at room temperature for about 8 minutes, then
diluted with 2 ml of DMF, and 500 μl are applied to the
surface of the slide or the solution is circulated
through the reactor system and allowed to react for about
2 hours at room temperature. The slide is then washed
with DMF, methylene chloride and ethanol.

Table 3

Representative Monomer Carrier Solution "B"

250 μl    DMF
111 mg    BOP (Benzotriazolyl-n-oxy-tris(dimethylamino)
          phosphonium hexafluorophosphate)
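
From the quantities in Tables 2 and 3, the approximate concentration of protected amino acid in the diluted coupling mixture can be estimated. The sketch below is only a rough estimate: it counts liquid volumes alone and neglects the volume contributed by the dissolved solids, and the constant and function names are illustrative.

```python
SOLUTION_A_UL = {"DMF": 250, "DIEA": 86}  # liquid components of Table 2
SOLUTION_B_UL = {"DMF": 250}              # liquid component of Table 3
DILUENT_DMF_UL = 2000                     # 2 ml of DMF added after mixing
AMINO_ACID_MG = 100                       # NVOC-protected amino acid, Table 2


def total_liquid_volume_ul():
    """Total liquid volume in microliters after mixing A and B and diluting."""
    return (sum(SOLUTION_A_UL.values())
            + sum(SOLUTION_B_UL.values())
            + DILUENT_DMF_UL)


def amino_acid_mg_per_ml():
    """Approximate amino acid concentration, neglecting the volume of solids."""
    return AMINO_ACID_MG / (total_liquid_volume_ul() / 1000)


def amino_acid_mg_in_aliquot(aliquot_ul=500):
    """Approximate mass of amino acid in the aliquot applied to the slide."""
    return amino_acid_mg_per_ml() * aliquot_ul / 1000


if __name__ == "__main__":
    print(total_liquid_volume_ul(), "ul total")
    print(round(amino_acid_mg_per_ml(), 1), "mg/ml")
    print(round(amino_acid_mg_in_aliquot(), 1), "mg in 500 ul")
```

On these assumptions the diluted mixture is about 2.6 ml at roughly 39 mg/ml, so a 500 μl aliquot carries about 19 mg of protected amino acid.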



As the solution containing the monomer to be 
attached is circulated through the cavity, the amino acid 
or other monomer will react at its carboxy terminus with 
amino groups on the regions of the substrate which have 
been deprotected. Of course, while the invention is 
illustrated by way of circulation of the monomer through 
the cavity, the invention could be practiced by way of 
removing the slide from the reactor and submersing it in 
an appropriate monomer solution.

After addition of the first monomer, the
solution containing the first amino acid is then purged
from the system. After circulation of a sufficient
amount of the DMF/methylene chloride such that removal of
the amino acid can be assured (e.g., about 50 times the
volume of the cavity and carrier lines), the mask or
substrate is repositioned, or a new mask is utilized such
that second regions on the substrate will be exposed to
light and the light source 124 is engaged for a second
exposure. This will deprotect second regions on the
substrate and the process is repeated until the desired
polymer sequences have been synthesized.

The entire derivatized substrate is then 
exposed to a receptor of interest, preferably labeled 
with, for example, a fluorescent marker, by circulation 
of a solution or suspension of the receptor through the 
cavity or by contacting the surface of the slide in bulk. 
The receptor will preferentially bind to certain regions 
of the substrate which contain complementary sequences. 

Antibodies are typically suspended in what is
commonly referred to as "supercocktail," which may be,
for example, a solution of about 1% BSA (bovine serum
albumin), 0.5% Tween in PBS (phosphate buffered saline)
buffer. The antibodies are diluted into the
supercocktail buffer to a final concentration of, for
example, about 0.1 to 4 μg/ml.

Fig. 12B illustrates an alternative preferred
embodiment of the reactor shown in Fig. 8A. According 
to this embodiment, the mask 128 is placed directly in 
contact with the substrate. Preferably, the etched 
portion of the mask is placed face down so as to reduce 
the effects of light dispersion. According to this 
embodiment, the imaging lenses 120 and 126 are not 
necessary because the mask is brought into close 
proximity with the substrate.

For purposes of increasing the signal-to-noise
ratio of the technique, some embodiments of the invention
provide for exposure of the substrate to a first labeled
or unlabeled receptor followed by exposure of a labeled,
second receptor (e.g., an antibody) which binds at
multiple sites on the first receptor. If, for example,
the first receptor is an antibody derived from a first
species of an animal, the second receptor is an antibody
derived from a second species directed to epitopes
associated with the first species. In the case of a
mouse antibody, for example, fluorescently labeled goat
antibody or antiserum which is anti-mouse may be used to
bind at multiple sites on the mouse antibody, providing
several times the fluorescence compared to the attachment
of a single mouse antibody at each binding site. This
process may be repeated again with additional antibodies
(e.g., goat-mouse-goat, etc.) for further signal
amplification.

In preferred embodiments an ordered sequence of 
masks is utilized. In some embodiments it is possible to 
use as few as a single mask to synthesize all of the 
possible polymers of a given monomer set. 

If, for example, it is desired to synthesize 
all 16 dinucleotides from four bases, a 1 cm square 
synthesis region is divided conceptually into 16 boxes, 
each 0.25 cm wide. Denote the four monomer units by A, 
B, C, and D. The first reactions are carried out in four 
vertical columns, each 0.25 cm wide. The first mask
exposes the left-most column of boxes, where A is
coupled. The second mask exposes the next column,
where B is coupled; followed by a third mask, for the
C column; and a final mask that exposes the right-most
column, for D. The first, second, third, and fourth
masks may be a single mask translated to different
locations.

The process is repeated in the horizontal
direction for the second unit of the dimer. This time,
the masks allow exposure of horizontal rows, again
0.25 cm wide. A, B, C, and D are sequentially coupled
using masks that expose horizontal fourths of the
reaction area. The resulting substrate contains all
16 dinucleotides of four bases.

The eight masks used to synthesize the
dinucleotide are related to one another by translation or
rotation. In fact, one mask can be used in all eight
steps if it is suitably rotated and translated. For
example, in the example above, a mask with a single
transparent region could be sequentially used to expose
each of the vertical columns, rotated 90°, and then
sequentially used to allow exposure of the horizontal
rows.
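
The column-then-row strategy can be sketched as a simulation on a 4x4 grid. This is an idealized model in which each exposure deprotects exactly one column or one row and each coupling goes to completion; the function name synthesize_dimers and the grid representation are illustrative assumptions.

```python
from itertools import product

MONOMERS = ["A", "B", "C", "D"]
N = len(MONOMERS)


def synthesize_dimers():
    """Build a 4x4 grid of dimers using four column exposures and four row exposures."""
    grid = [["" for _ in range(N)] for _ in range(N)]
    steps = 0
    # Steps 1-4: expose one vertical column at a time and couple the first unit.
    for col, monomer in enumerate(MONOMERS):
        for row in range(N):
            grid[row][col] += monomer
        steps += 1
    # Steps 5-8: same stripe mask rotated 90 degrees, one horizontal row at a time.
    for row, monomer in enumerate(MONOMERS):
        for col in range(N):
            grid[row][col] += monomer
        steps += 1
    return grid, steps


grid, steps = synthesize_dimers()

if __name__ == "__main__":
    for row in grid:
        print(" ".join(row))
    print("exposure/coupling steps:", steps)
```

Each cell holds the first unit from its column followed by the second unit from its row, so eight steps produce all 16 dimers.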

Tables 4 and 5 provide a simple computer 
program in Quick Basic for planning a masking program and 
a sample output, respectively, for the synthesis of a 
polymer chain of three monomers ("residues") having 
three different monomers in the first level, four 
different monomers in the second level, and five 
different monomers in the third level in a striped 
pattern. The output of the program is the number of 
cells, the number of "stripes" (light regions) on each 
mask, and the amount of translation required for each 
exposure of the mask.

Table 4
Mask Strategy Program

DEFINT A-Z
DIM b(20), w(20), l(500)
f$ = "LPT1:"
OPEN f$ FOR OUTPUT AS #1

jmax = 3                        'Number of residues
b(1) = 3: b(2) = 4: b(3) = 5    'Number of building blocks for res 1,2,3
g = 1: lmax(1) = 1

FOR j = 1 TO jmax: g = g * b(j): NEXT j
w(0) = 0: w(1) = g / b(1)

PRINT #1, "MASK2.BAS ", DATE$, TIME$: PRINT #1,
PRINT #1, USING "Number of residues=##"; jmax
FOR j = 1 TO jmax
  PRINT #1, USING "   Residue ##   ## building blocks"; j; b(j)
NEXT j
PRINT #1, ""
PRINT #1, USING "Number of cells=####"; g: PRINT #1,

FOR j = 2 TO jmax
  lmax(j) = lmax(j - 1) * b(j - 1)
  w(j) = w(j - 1) / b(j)
NEXT j

FOR j = 1 TO jmax
  PRINT #1, USING "Mask for residue ##"; j: PRINT #1,
  PRINT #1, USING "   Number of stripes=###"; lmax(j)
  PRINT #1, USING "   Width of each stripe=###"; w(j)
  FOR l = 1 TO lmax(j)
    a = 1 + (l - 1) * w(j - 1)
    ae = a + w(j) - 1
    PRINT #1, USING "   Stripe ## begins at location ### and ends at ###"; l; a; ae
  NEXT l
  PRINT #1,
  PRINT #1, USING "   For each of ## building blocks, translate mask by ## cell(s)"; b(j); w(j),
  PRINT #1, : PRINT #1, : PRINT #1,
NEXT j

© Copyright 1990, Affymax N.V.

Table 5
Masking Strategy Output

Number of residues= 3
   Residue  1    3 building blocks
   Residue  2    4 building blocks
   Residue  3    5 building blocks

Number of cells=  60

Mask for residue  1

   Number of stripes=  1
   Width of each stripe= 20
   Stripe  1 begins at location   1 and ends at  20

   For each of  3 building blocks, translate mask by 20 cell(s)

Mask for residue  2

   Number of stripes=  3
   Width of each stripe=  5
   Stripe  1 begins at location   1 and ends at   5
   Stripe  2 begins at location  21 and ends at  25
   Stripe  3 begins at location  41 and ends at  45

   For each of  4 building blocks, translate mask by  5 cell(s)

Mask for residue  3

   Number of stripes= 12
   Width of each stripe=  1
   Stripe  1 begins at location   1 and ends at   1
   Stripe  2 begins at location   6 and ends at   6
   Stripe  3 begins at location  11 and ends at  11
   Stripe  4 begins at location  16 and ends at  16
   Stripe  5 begins at location  21 and ends at  21
   Stripe  6 begins at location  26 and ends at  26
   Stripe  7 begins at location  31 and ends at  31
   Stripe  8 begins at location  36 and ends at  36
   Stripe  9 begins at location  41 and ends at  41
   Stripe 10 begins at location  46 and ends at  46
   Stripe 11 begins at location  51 and ends at  51
   Stripe 12 begins at location  56 and ends at  56

   For each of  5 building blocks, translate mask by  1 cell(s)

© Copyright 1990, Affymax N.V.
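
The masking plan tabulated above can also be reproduced with a short Python sketch of the same arithmetic. This is a restatement for illustration rather than the listing of Table 4 itself: the function name mask_plan and the dictionary keys are assumptions, and results are returned as data instead of being printed to a line printer.

```python
def mask_plan(blocks):
    """Plan striped masks for a polymer with blocks[j] monomer choices at level j.

    Returns (cells, plan), where plan holds one dict per residue giving the
    number of stripes, the stripe width in cells, the (start, end) cell of
    each stripe, and the translation applied for each building block.
    """
    cells = 1
    for count in blocks:
        cells *= count

    plan = []
    stripes = 1    # lmax(j) in the Basic listing
    period = 0     # w(j - 1): spacing between stripe starts, with w(0) = 0
    width = cells  # divided by blocks[j] below to give w(j)
    for j, count in enumerate(blocks):
        if j > 0:
            stripes *= blocks[j - 1]
            period = width
        width //= count
        spans = [(1 + k * period, k * period + width) for k in range(stripes)]
        plan.append({
            "residue": j + 1,
            "stripes": stripes,
            "width": width,
            "spans": spans,
            "translate_by": width,
        })
    return cells, plan


cells, plan = mask_plan([3, 4, 5])

if __name__ == "__main__":
    print("Number of cells =", cells)
    for mask in plan:
        print(mask)
```

For three, four, and five building blocks at the three levels, the sketch gives 60 cells and masks with 1, 3, and 12 stripes of widths 20, 5, and 1 cells respectively.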
V. Details of One Embodiment of
a Fluorescent Detection Device

Fig. 13 illustrates a fluorescent detection
device for detecting fluorescently labeled receptors
on a substrate. A substrate 112 is placed on an x/y
translation table 202. In a preferred embodiment the x/y
translation table is a model no. PM500-A1 manufactured
by Newport Corporation. The x/y translation table is
connected to and controlled by an appropriately
programmed digital computer 204 which may be, for
example, an appropriately programmed IBM PC/AT or AT
compatible computer. Of course, other computer systems,
special purpose hardware, or the like could readily be
substituted for the AT computer used herein for
illustration. Computer software for the translation
and data collection functions described herein can be
provided based on commercially available software
including, for example, "Lab Windows" licensed by
National Instruments, which is incorporated herein by
reference for all purposes.

The substrate and x/y translation table are
placed under a microscope 206 which includes one or more
objectives 208. Light (about 488 nm) from a laser 210,
which in some embodiments is a model no. 2020-05 argon
ion laser manufactured by Spectraphysics, is directed at
the substrate by a dichroic mirror 207 which passes
greater than about 520 nm light but reflects 488 nm
light. Dichroic mirror 207 may be, for example, a model
no. FT510 manufactured by Carl Zeiss. Light reflected
from the mirror then enters the microscope 206 which may
be, for example, a model no. Axioscop 20 manufactured
by Carl Zeiss. Fluorescein-marked materials on the
substrate will fluoresce >488 nm light, and the
fluoresced light will be collected by the microscope
and passed through the mirror. The fluorescent light
from the substrate is then directed through a wavelength
filter 209 and, thereafter, through an aperture plate 211.
Wavelength filter 209 may be, for example, a model
no. OG530 manufactured by Melles Griot and aperture
plate 211 may be, for example, a model no. 477352/477380
manufactured by Carl Zeiss.

The fluoresced light then enters a
photomultiplier tube 212 which in some embodiments is a
model no. R943-02 manufactured by Hamamatsu. The signal
is amplified in preamplifier 214 and photons are counted
by photon counter 216. The number of photons is recorded
as a function of the location in the computer 204.
Pre-Amp 214 may be, for example, a model no. SR440
manufactured by Stanford Research Systems and photon
counter 216 may be a model no. SR400 manufactured by
Stanford Research Systems. The substrate is then moved
to a subsequent location and the process is repeated.
In preferred embodiments the data are acquired every 1 to
100 μm with a data collection diameter of about 0.8 to
10 μm preferred. In embodiments with sufficiently high
fluorescence, a CCD detector with broadfield illumination
is utilized.

By counting the number of photons generated in 
a given area in response to the laser, it is possible to 
determine where fluorescent marked molecules are located 
on the substrate. Consequently, for a slide which has a 
matrix of polypeptides, for example, synthesized on the 
surface thereof, it is possible to determine which of the 
polypeptides is complementary to a fluorescently marked
receptor. 
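The localization step described above can be sketched in a few lines. This is a minimal illustration only: the function name, the 3x background threshold, and the scan values are assumptions made for the example and do not come from the apparatus described here.

```python
def fluorescent_sites(counts, background, threshold_ratio=3.0):
    """Return (row, col) scan positions whose photon count exceeds
    threshold_ratio times the background count.

    The threshold rule is an assumption for illustration.
    """
    sites = []
    for r, row in enumerate(counts):
        for c, n in enumerate(row):
            if n > threshold_ratio * background:
                sites.append((r, c))
    return sites


# Hypothetical 3 x 4 scan: photon counts recorded at each stage position.
scan = [
    [120, 130, 2400, 2500],
    [110, 125, 2300, 2450],
    [115, 118, 122, 119],
]
bright = fluorescent_sites(scan, background=120)
```

Mapping each position in `bright` back to the known synthesis layout would then identify which surface-bound sequences bound the labeled receptor.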

According to preferred embodiments, the 
intensity and duration of the light applied to the 
substrate is controlled by varying the laser power and 
scan stage rate for improved signal-to-noise ratio by 
maximizing fluorescence emission and minimizing 
background noise. 

While the detection apparatus has been 
illustrated primarily herein with regard to the detection 
of marked receptors, the invention will find application
in other areas. For example, the detection apparatus
disclosed herein could be used in the fields of 
catalysis, DNA or protein gel scanning, and the like. 

VI. Determination of Relative Binding Strength of Receptors

The signal-to-noise ratio of the present 
invention is sufficiently high that not only can the 
presence or absence of a receptor on a ligand be 
detected, but also the relative binding affinity of 
receptors to a variety of sequences can be determined. 

In practice it is found that a receptor will 
bind to several peptide sequences in an array, but will 
bind much more strongly to some sequences than others. 
Strong binding affinity will be evidenced herein by a 
strong fluorescent or radiographic signal since many 
receptor molecules will bind in a region of a strongly 
bound ligand. Conversely, a weak binding affinity will 
be evidenced by a weak fluorescent or radiographic signal 
due to the relatively small number of receptor molecules 
which bind in a particular region of a substrate having a 
ligand with a weak binding affinity for the receptor. 
Consequently, it becomes possible to determine relative 
binding avidity (or affinity in the case of univalent 
interactions) of a ligand herein by way of the intensity 
of a fluorescent or radiographic signal in a region 
containing that ligand. 
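As a rough illustration of ranking ligands by signal intensity, the sketch below subtracts a background count and normalizes to the strongest region. The ligand names and count values are hypothetical placeholders, and treating net counts as proportional to bound receptor is an assumption of the example.

```python
def rank_by_signal(signals, background=0):
    """Rank regions by background-subtracted counts, strongest first.

    Returns (name, relative_signal) pairs, where relative_signal is the
    net count divided by the largest net count in the set.
    """
    net = {name: max(count - background, 0) for name, count in signals.items()}
    top = max(net.values())
    ranked = sorted(net.items(), key=lambda item: item[1], reverse=True)
    return [(name, value / top if top else 0.0) for name, value in ranked]


# Hypothetical counts for three regions, each bearing a different ligand.
example = {"ligand_1": 60000, "ligand_2": 150000, "ligand_3": 20000}
ranking = rank_by_signal(example, background=20000)
```

The relative values give only an ordering of apparent binding strength, not absolute affinities.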

Semiquantitative data on affinities might 
also be obtained by varying washing conditions and 
concentrations of the receptor. This would be done by
comparison to known ligand-receptor pairs, for example.

VII. Examples 

The following examples are provided to 
illustrate the efficacy of the inventions herein. All
operations were conducted at about ambient temperatures
and pressures unless indicated to the contrary.

A. Slide Preparation

Before attachment of reactive groups it is
preferred to clean the substrate which is, in a preferred
embodiment, a glass substrate such as a microscope slide
or cover slip. According to one embodiment the slide is
soaked in an alkaline bath consisting of, for example,
1 liter of 95% ethanol with 120 ml of water and 120 grams
of sodium hydroxide for 12 hours. The slides are then
washed under running water and allowed to air dry, and
rinsed once with a solution of 95% ethanol.

The slides are then aminated with, for example,
aminopropyltriethoxysilane for the purpose of attaching
amino groups to the glass surface on linker molecules,
although any omega-functionalized silane could also be
used for this purpose. In one embodiment 0.1%
aminopropyltriethoxysilane is utilized, although
solutions with concentrations from 10^-7% to 10% may be
used, with about 10^-3% to 2% preferred. A 0.1% mixture
is prepared by adding to 100 ml of a 95% ethanol/5% water
mixture, 100 microliters (µl) of aminopropyltriethoxysilane.
The mixture is agitated at about ambient
temperature on a rotary shaker for about 5 minutes.
500 µl of this mixture is then applied to the surface
of one side of each cleaned slide. After 4 minutes, the
slides are decanted of this solution and rinsed three
times by dipping in, for example, 100% ethanol.
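The dilution arithmetic for the silane mixture can be checked with a short calculation. This sketch assumes the percentage is volume/volume and ignores the small contribution of the silane to the total volume; the function names are illustrative.

```python
def volume_percent(solute_ul, solvent_ml):
    """Approximate v/v percent of solute_ul microliters added to
    solvent_ml milliliters (solute volume neglected in the total)."""
    return solute_ul / (solvent_ml * 1000.0) * 100.0


def solute_ul_for(percent, solvent_ml):
    """Microliters of solute needed for a target v/v percent."""
    return percent / 100.0 * solvent_ml * 1000.0


# 100 microliters of silane in 100 ml of ethanol/water is about 0.1%.
mixture = volume_percent(100, 100)
# Upper end of the preferred range (2%) in the same 100 ml volume.
silane_for_2_percent = solute_ul_for(2, 100)
```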

After the plates dry, they are placed in a 
110-120 °C vacuum oven for about 20 minutes, and then
allowed to cure at room temperature for about 12 hours 
in an argon environment. The slides are then dipped into 
DMF (dimethylformamide) solution, followed by a thorough 
washing with methylene chloride. 

The aminated surface of the slide is then
exposed to about 500 µl of, for example, a 30 millimolar
(mM) solution of NVOC-GABA (gamma amino butyric acid) NHS
(N-hydroxysuccinimide) in DMF for attachment of an
NVOC-GABA to each of the amino groups.
The surface is washed with, for example, DMF,
methylene chloride, and ethanol.

Any unreacted aminopropyl silane on the 
surface — that is, those amino groups which have not had 
the NVOC-GABA attached — are now capped with acetyl groups 
(to prevent further reaction) by exposure to a 1:3 
mixture of acetic anhydride in pyridine for 1 hour. 
Other materials which may perform this residual capping
function include trifluoroacetic anhydride, formic acetic
anhydride, or other reactive acylating agents. Finally,
the slides are washed again with DMF, methylene chloride, 
and ethanol. 

B. Synthesis of Eight Trimers of "A" and "B"

Fig. 14 illustrates a possible synthesis of
the eight trimers of the two-monomer set: gly, phe
(represented by "A" and "B," respectively). A glass
slide bearing silane groups terminating in
6-nitroveratryloxycarboxamide (NVOC-NH) residues is prepared
as a substrate. Active esters (pentafluorophenyl, OBt,
etc.) of gly and phe protected at the amino group with
NVOC are prepared as reagents. While not pertinent to
this example, if side chain protecting groups are 
required for the monomer set, these must not be 
photoreactive at the wavelength of light used to 
protect the primary chain. 

For a monomer set of size n, n x ℓ cycles
are required to synthesize all possible sequences of
length ℓ. A cycle consists of:

1. Irradiation through an appropriate mask 
to expose the amino groups at the sites 
where the next residue is to be added, 
with appropriate washes to remove the 
by-products of the deprotection. 

2. Addition of a single activated and
protected (with the same photochemically
removable group) monomer, which will react
only at the sites addressed in step 1, with
appropriate washes to remove the excess
reagent from the surface.

The above cycle is repeated for each member of 
the monomer set until each location on the surface has 
been extended by one residue in one embodiment. In other 
embodiments, several residues are sequentially added at 
one location before moving on to the next location. 
Cycle times will generally be limited by the coupling
reaction rate, now as short as 20 min in automated
peptide synthesizers. This step is optionally followed
by addition of a protecting group to stabilize the array
for later testing. For some types of polymers 
(e.g., peptides), a final deprotection of the entire 
surface (removal of photoprotective side chain groups) 
may be required. 

More particularly, as shown in Fig. 14A, the
glass 20 is provided with regions 22, 24, 26, 28, 30, 32,
34, and 36. Regions 30, 32, 34, and 36 are masked, as
shown in Fig. 14B, and the glass is irradiated and
exposed to a reagent containing "A" (e.g., gly), with the
resulting structure shown in Fig. 14C. Thereafter,
regions 22, 24, 26, and 28 are masked, the glass is
irradiated (as shown in Fig. 14D) and exposed to a
reagent containing "B" (e.g., phe), with the resulting
structure shown in Fig. 14E. The process proceeds,
consecutively masking and exposing the sections as shown
until the structure shown in Fig. 14F is obtained. The
glass is irradiated and the terminal groups are,
optionally, capped by acetylation. As shown, all
possible trimers of gly/phe are obtained.
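The masking sequence can be simulated to confirm that n x ℓ cycles yield every sequence. The sketch below is an illustration under stated assumptions: the eight regions are indexed 0 to 7, and each mask exposes the regions whose base-n digit for the current layer matches the monomer being added. For the first layer this reproduces the half-and-half masking described above; the later mask patterns are an assumption, since they are shown only in the figures.

```python
from itertools import product


def masked_synthesis(monomers, length):
    """Simulate light-directed synthesis on n**length regions.

    Each cycle exposes one subset of regions (the mask) and couples one
    monomer there. Strings record monomers in order of addition.
    Returns (sequence per region, number of cycles).
    """
    n = len(monomers)
    n_sites = n ** length
    sites = [""] * n_sites
    cycles = 0
    for layer in range(length):
        block = n ** (length - 1 - layer)  # run length of this layer's mask
        for index, monomer in enumerate(monomers):
            for site in range(n_sites):
                if (site // block) % n == index:  # region exposed by mask
                    sites[site] += monomer
            cycles += 1
    return sites, cycles


sites, cycles = masked_synthesis("AB", 3)
```

With two monomers and trimers, the simulation uses six cycles and produces eight distinct sequences, one per region.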

In this example, no side chain protective 
group removal is necessary. If it is desired, side chain 
deprotection may be accomplished by treatment with 
ethanedithiol and trifluoroacetic acid.

In general, the number of steps needed to
obtain a particular polymer chain is defined by:

    n x ℓ        (1)

where:

    n = the number of monomers in the basis set of
        monomers, and

    ℓ = the number of monomer units in a polymer
        chain.

Conversely, the synthesized number of sequences
of length ℓ will be:

    n^ℓ        (2)

Of course, greater diversity is obtained by
using masking strategies which will also include the
synthesis of polymers having a length of less than ℓ.
If, in the extreme case, all polymers having a length
less than or equal to ℓ are synthesized, the number of
polymers synthesized will be:

    n^ℓ + n^(ℓ-1) + ... + n^1        (3)

The maximum number of lithographic steps needed
will generally be n for each "layer" of monomers, i.e.,
the total number of masks (and, therefore, the number of
lithographic steps) needed will be n x ℓ. The size of
the transparent mask regions will vary in accordance with 
the area of the substrate available for synthesis and the 
number of sequences to be formed. In general, the size 
of the synthesis areas will be: 

    size of synthesis areas = A / S

where:

    A is the total area available for synthesis;
    and

    S is the number of sequences desired in the
    area.
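The relationships above are easy to evaluate numerically. The following sketch restates them as small functions; the function names are illustrative, and the 1 cm^2 area in the example is an assumed value.

```python
def lithographic_steps(n, length):
    """Number of masking steps: n monomers for each of `length` layers."""
    return n * length


def sequences_of_length(n, length):
    """Number of distinct sequences of exactly `length` monomers."""
    return n ** length


def sequences_up_to_length(n, length):
    """Number of distinct sequences of length 1 through `length`."""
    return sum(n ** k for k in range(1, length + 1))


def synthesis_area(total_area, n_sequences):
    """Area available to each sequence: A / S."""
    return total_area / n_sequences


# Two monomers, trimers, on an assumed 1 cm^2 synthesis area.
steps = lithographic_steps(2, 3)
trimers = sequences_of_length(2, 3)
all_up_to_trimers = sequences_up_to_length(2, 3)
area_each_cm2 = synthesis_area(1.0, trimers)
```

For a 20-monomer set and pentamers the same functions give 100 steps and 3,200,000 sequences, which shows how slowly the step count grows relative to the number of sequences.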

It will be appreciated by those of skill in 
the art that the above method could readily be used to 
simultaneously produce thousands or millions of oligomers 
on a substrate using the photolithographic techniques 
disclosed herein. Consequently, the method results in 
the ability to practically test large numbers of, for
example, di-, tri-, tetra-, penta-, hexa-, hepta-,
octapeptides, dodecapeptides, or larger polypeptides
(or correspondingly, polynucleotides).

The above example has illustrated the method 
by way of a manual example. It will of course be 
appreciated that automated or semi-automated methods 
could be used. The substrate would be mounted in a flow 
cell for automated addition and removal of reagents, to 
minimize the volume of reagents needed, and to more 
carefully control reaction conditions. Successive masks 
could be applied manually or automatically. 

C. Synthesis of a Dimer of an Aminopropyl
Group and a Fluorescent Group

In synthesizing the dimer of an aminopropyl
group and a fluorescent group, a functionalized Durapore
membrane was used as a substrate. The Durapore membrane
was a polyvinylidene difluoride with aminopropyl groups.
The aminopropyl groups were protected with the DDZ group
by reaction of the carbonyl chloride with the amino
groups, a reaction readily known to those of skill in
the art. The surface bearing these groups was placed
in a solution of THF and contacted with a mask bearing
a checkerboard pattern of 1 mm opaque and transparent
regions. The mask was exposed to ultraviolet light
having a wavelength down to at least about 280 nm for
about 5 minutes at ambient temperature, although a wide
range of exposure times and temperatures may be
appropriate in various embodiments of the invention.
For example, in one embodiment, an exposure time of
between about 1 and 5000 seconds may be used at process
temperatures of between -70 and +50 °C.

In one preferred embodiment, exposure times of 
between about 1 and 500 seconds at about ambient pressure 
are used. In some preferred embodiments, pressure above 
ambient is used to prevent evaporation.

The surface of the membrane was then washed for 
about 1 hour with a fluorescent label which included an 
active ester bound to a chelate of a lanthanide. Wash 
times will vary over a wide range of values from about a 
few minutes to a few hours. These materials fluoresce 
in the red and the green visible region. After the 
reaction with the active ester in the fluorophore was
complete, the locations in which the fluorophore was
bound could be visualized by exposing them to ultraviolet 
light and observing the red and the green fluorescence. 
It was observed that the derivatized regions of the 
substrate closely corresponded to the original pattern 
of the mask. 

D. Demonstration of Signal Capability 

Signal detection capability was demonstrated
using a low-level standard fluorescent bead kit
manufactured by Flow Cytometry Standards and having model
no. 824. This kit includes 5.8 µm diameter beads, each
impregnated with a known number of fluorescein molecules.

One of the beads was placed in the illumination
field on the scan stage as shown in Fig. 9 in a field of
a laser spot which was initially shuttered. After the bead
was positioned in the illumination field, the photon
detection equipment was turned on. The laser beam was
unblocked and it interacted with the particle bead,
which then fluoresced. Fluorescence curves of beads
impregnated with 7,000; 13,000; and 29,000 fluorescein
molecules are shown in Figs. 11A, 11B, and 11C,
respectively. On each curve, traces for beads without
fluorescein molecules are also shown. These experiments
were performed with 488 nm excitation, with 100 of
laser power. The light was focused through a 40 power
0.75 NA objective.

The fluorescence intensity in all cases started 
off at a high value and then decreased exponentially. 
The fall-off in intensity is due to photobleaching of 
the fluorescein molecules. The traces of beads without 
fluorescein molecules are used for background 
subtraction. The difference in the initial exponential 
decay between labeled and nonlabeled beads is integrated 
to give the total number of photon counts, and this 
number is related to the number of molecules per bead. 
Therefore, it is possible to deduce the number of photons
per fluorescein molecule that can be detected. For the
curves illustrated in Fig. 11, this calculation indicates
that about 40 to 50 photons per fluorescein
molecule are detected.
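The background subtraction and integration can be sketched as follows. The two traces are synthetic values invented for the example (a halving decay over a flat background), chosen only so that the arithmetic is easy to follow; they are not measured data, and summing counts per time bin is an assumed stand-in for the integration.

```python
def photons_per_molecule(labeled, blank, molecules_per_bead):
    """Integrate the background-subtracted decay and divide by the
    known number of fluorescein molecules in the bead.

    `labeled` and `blank` are photon counts per time bin for a labeled
    bead and a bead without fluorescein, respectively.
    """
    net_counts = sum(max(a - b, 0) for a, b in zip(labeled, blank))
    return net_counts / molecules_per_bead


# Synthetic traces: a decaying signal on top of a flat background.
blank_trace = [50, 50, 50, 50, 50, 50]
decay = [160000, 80000, 40000, 20000, 10000, 5000]
labeled_trace = [b + d for b, d in zip(blank_trace, decay)]

yield_per_molecule = photons_per_molecule(labeled_trace, blank_trace, 7000)
```

With these invented numbers the net integral is 315,000 counts, giving 45 detected photons per molecule for a 7,000-molecule bead.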

E. Determination of the Number of 
Molecules Per Unit Area 

Aminopropylated glass microscope slides 
prepared according to the methods discussed above were 
utilized in order to establish the density of labeling of 
the slides. The free amino termini of the slides were 
reacted with FITC (fluorescein isothiocyanate) which 
forms a covalent linkage with the amino group. The slide 
is then scanned to count the number of fluorescent 
photons generated in a region which, using the estimated 
40-50 photons per fluorescent molecule, enables the
calculation of the number of molecules which are on the 
surface per unit area. 

A slide with aminopropyl silane on its surface 
was immersed in a 1 mM solution of FITC in DMF for 
1 hour at about ambient temperature. After reaction, the 
slide was washed twice with DMF and then washed with
ethanol, water, and then ethanol again. It was then
dried and stored in the dark until it was ready to be
examined.

Through the use of curves similar to those 
shown in Fig. 15 and by integrating the fluorescent 
counts under the exponentially decaying signal, the 
number of free amino groups on the surface after 
derivatization was determined. It was determined that
slides with labeling densities of 1 fluorescein per
10^3 x 10^3 to ~2 x 2 nm could be reproducibly made as the
concentration of aminopropyltriethoxysilane varied from
10^-5% to 10^-1%.
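The labeling densities quoted above can be converted to molecules per unit area with a short calculation. This sketch assumes one molecule per square patch of the stated side length; the function names and the example count are illustrative.

```python
NM2_PER_CM2 = 1e14  # (1e7 nm per cm) squared


def molecules_per_cm2(side_nm):
    """Surface density for one molecule per side_nm x side_nm square."""
    return NM2_PER_CM2 / (side_nm * side_nm)


def molecules_from_counts(photon_counts, photons_per_molecule):
    """Estimate the number of molecules in a scanned region from its
    integrated photon count and the detected photons per molecule."""
    return photon_counts / photons_per_molecule


sparse = molecules_per_cm2(1000)  # 1 per 10^3 x 10^3 nm
dense = molecules_per_cm2(2)      # 1 per 2 x 2 nm

# Hypothetical region yielding 450,000 counts at 45 photons per molecule.
estimated_molecules = molecules_from_counts(450000, 45)
```

The two ends of the range correspond to about 1e8 and 2.5e13 molecules per cm^2, a span of more than five orders of magnitude.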

F. Removal of NVOC and Attachment of 
A Fluorescent Marker 

NVOC-GABA groups were attached as described 
above. The entire surface of one slide was exposed to 
light so as to expose a free amino group at the end of 
the gamma amino butyric acid. This slide, and a 
duplicate which was not exposed, were then exposed
to fluorescein isothiocyanate (FITC).

Fig. 16A illustrates the slide which was not
exposed to light, but which was exposed to FITC. The
units of the x axis are time and the units of the y axis
are counts. The trace contains a certain amount of
background fluorescence. The duplicate slide was exposed
to 350 nm broadband illumination for about 1 minute
(12 mW/cm^2, ~350 nm illumination), washed and reacted
with FITC. The fluorescence curves for this slide are
shown in Fig. 16B. A large increase in the level of
fluorescence is observed, which indicates photolysis
has exposed a number of amino groups on the surface of
the slides for attachment of a fluorescent marker.

G. Use of a Mask in Removal of NVOC

The next experiment was performed with a 0.1% 
aminopropylated slide. Light from a Hg-Xe arc lamp 
was imaged onto the substrate through a laser-ablated 
chrome-on-glass mask in direct contact with the 
substrate.

This slide was illuminated for approximately 5
minutes, with 12 mW of 350 nm broadband light and then
reacted with the 1 mM FITC solution. It was put on the
laser detection scanning stage and a graph was plotted as
a two-dimensional representation of position color-coded
for fluorescence intensity. The fluorescence intensity
(in counts) as a function of location is given on the
color scale to the right of Figure 17A for a mask having
100x100 µm squares.

The experiment was repeated a number of times
through various masks. The fluorescence pattern for a
50 µm mask is illustrated in Fig. 17B, for a 20 µm mask
in Fig. 17C, and for a 10 µm mask in Fig. 17D. The mask
pattern is distinct down to at least about 10 µm squares
using this lithographic technique.

H. Attachment of YGGFL and Subsequent Exposure to
Herz Antibody and Goat Antimouse

In order to establish that receptors to a
particular polypeptide sequence would bind to a
surface-bound peptide and be detected, Leu enkephalin was coupled
to the surface and recognized by an antibody. A slide
was derivatized with 0.1% aminopropyltriethoxysilane
and protected with NVOC. A 500 µm checkerboard mask was
used to expose the slide in a flow cell using backside
contact printing. The Leu enkephalin sequence
(H2N-tyrosine, glycine, glycine, phenylalanine, leucine-CO2H,
otherwise referred to herein as YGGFL) was attached via
its carboxy end to the exposed amino groups on the
surface of the slide. The peptide was added in DMF
solution with the BOP/HOBT/DIEA coupling reagents and
recirculated through the flow cell for 2 hours at room
temperature.

A first antibody, known as the Herz antibody, 
was applied to the surface of the slide for 45 minutes 
at 2 µg/ml in a supercocktail (containing 1% BSA and
1% ovalbumin also in this case). A second antibody,
goat anti-mouse fluorescein conjugate, was then added
at 2 µg/ml in the supercocktail buffer, and allowed to
incubate for 2 hours. 

The results of this experiment are provided in
Fig. 18. Again, this figure illustrates fluorescence
intensity as a function of position. The fluorescence
scale is shown on the right, according to the color
coding. This image was taken at 10 µm steps. This
figure indicates that not only can deprotection be
carried out in a well defined pattern, but also that (1)
the method provides for successful coupling of peptides
to the surface of the substrate, (2) the surface of a
bound peptide is available for binding with an antibody,
and (3) the detection apparatus capabilities are
sufficient to detect binding of a receptor.

I. Monomer-by-Monomer Formation of YGGFL and
Subsequent Exposure to Labeled Antibody

Monomer-by-monomer synthesis of YGGFL and GGFL
in alternate squares was performed on a slide in a
checkerboard pattern and the resulting slide was exposed
to the Herz antibody. This experiment and the results
thereof are illustrated in Figs. 19A, 19B, 19C, and 19D.

In Fig. 19A a slide is shown which is
derivatized with the aminopropyl group, protected in this
case with t-BOC (t-butoxycarbonyl). The slide was
treated with TFA to remove the t-BOC protecting group.
ε-Aminocaproic acid, which was t-BOC protected at its
amino group, was then coupled onto the aminopropyl
groups. The aminocaproic acid serves as a spacer between
the aminopropyl group and the peptide to be synthesized.
The amino end of the spacer was deprotected and coupled
to NVOC-leucine. The entire slide was then illuminated
with 12 mW of 3?5 nm broadband illumination. The slide
was then coupled with NVOC-phenylalanine and washed. The
entire slide was again illuminated, then coupled to
NVOC-glycine and washed. The slide was again illuminated
and coupled to NVOC-glycine to form the sequence shown in
the last portion of Fig. 19A.

As shown in Fig. 19B, alternating regions of
the slide were then illuminated using a projection print
using a 500x500 µm checkerboard mask; thus, the amino
group of glycine was exposed only in the lighted areas.
When the next coupling chemistry step was carried out,
NVOC-tyrosine was added, and it coupled only at those
spots which had received illumination. The entire slide
was then illuminated to remove all the NVOC groups,
leaving a checkerboard of YGGFL in the lighted areas and
in the other areas, GGFL. The Herz antibody (which
recognizes the YGGFL, but not GGFL) was then added,
followed by goat anti-mouse fluorescein conjugate.

The resulting fluorescence scan is shown in
Fig. 19C, and the color coding for the fluorescence
intensity is again given on the right. Dark areas
contain the tetrapeptide GGFL, which is not recognized by
the Herz antibody (and thus there is no binding of the
goat anti-mouse antibody with fluorescein conjugate),
and in the red areas YGGFL is present. The YGGFL
pentapeptide is recognized by the Herz antibody and,
therefore, there is antibody in the lighted regions for
the fluorescein-conjugated goat anti-mouse to recognize.

Similar patterns are shown for a 50 µm mask
used in direct contact ("proximity print") with the
substrate in Fig. 19D. Note that the pattern is more
distinct and the corners of the checkerboard pattern are
touching when the mask is placed in direct contact with
the substrate (which reflects the increase in resolution
using this technique).

J. Monomer-by-Monomer Synthesis of YGGFL and PGGFL

A synthesis using a 50 µm checkerboard mask
similar to that shown in Fig. 19 was conducted. However,
P was added to the GGFL sites on the substrate through an
additional coupling step. P was added by exposing
protected GGFL to light through a mask, and subsequent
exposure to P in the manner set forth above. Therefore,
half of the regions on the substrate contained YGGFL and
the remaining half contained PGGFL.

The fluorescence plot for this experiment is
provided in Fig. 20. As shown, the regions are again
readily discernible. This experiment demonstrates that
antibodies are able to recognize a specific sequence and
that the recognition is not length-dependent.

K. Monomer-by-Monomer Synthesis
of YGGFL and YPGGFL

In order to further demonstrate the operability
of the invention, a 50 µm checkerboard pattern of
alternating YGGFL and YPGGFL was synthesized on a
substrate using techniques like those set forth above.
The resulting fluorescence plot is provided in Fig. 21.
Again, it is seen that the antibody is clearly able 
to recognize the YGGFL sequence and does not bind 
significantly at the YPGGFL regions. 

L. Synthesis of an Array of Sixteen Different
Amino Acid Sequences and Estimation of Relative
Binding Affinity to Herz Antibody

Using techniques similar to those set forth
above, an array of 16 different amino acid sequences
(replicated four times) was synthesized on each of two
glass substrates. The sequences were synthesized by
attaching the sequence NVOC-GFL across the entire
surface of the slides. Using a series of masks, two
layers of amino acids were then selectively applied
to the substrate. Each region had dimensions of
0.25 cm x 0.0625 cm. The first slide contained amino
acid sequences containing only L amino acids while the
second slide contained selected D amino acids. Figs. 22A
and 22B illustrate a map of the various regions on the
first and second slides, respectively. The patterns
shown in Figs. 22A and 22B were duplicated four times on
each slide. The slides were then exposed to the Herz
antibody and fluorescein-labeled goat anti-mouse.

Fig. 23 is a fluorescence plot of the first
slide, which contained only L amino acids. Red indicates
strong binding (149,000 counts or more) while black
indicates little or no binding of the Herz antibody
(20,000 counts or less). The bottom right-hand portion
of the slide appears "cut off" because the slide was
broken during processing. The sequence YGGFL is clearly
most strongly recognized. The sequences YAGFL and YSGFL
also exhibit strong recognition by the antibody. By
contrast, most of the remaining sequences show little or 
no binding. The four duplicate portions of the slide are 
extremely consistent in the amount of binding shown 
therein. 
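The color thresholds stated for Fig. 23 can be expressed as a simple classification of photon counts. The thresholds of 149,000 and 20,000 counts come from the description above; the "intermediate" label for counts between them and the function name are assumptions added for the example.

```python
STRONG_MIN = 149000  # counts at or above this are shown as red
WEAK_MAX = 20000     # counts at or below this are shown as black


def binding_class(counts):
    """Classify a region's fluorescence counts using the thresholds
    given for the fluorescence plot."""
    if counts >= STRONG_MIN:
        return "strong"
    if counts <= WEAK_MAX:
        return "little or none"
    return "intermediate"  # label assumed for the in-between range


# Hypothetical counts for three regions.
classes = [binding_class(c) for c in (160000, 75000, 12000)]
```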

Fig. 24 is a fluorescence plot of the second 
slide. Again, strongest binding is exhibited by the 
YGGFL sequence. Significant binding is also detected to 
YaGFL, YsGFL, and YpGFL. The remaining sequences show 
less binding with the antibody. Note the low binding
efficiency of the sequence yGGFL.

Table 6 lists the various sequences tested
in order of relative fluorescence, which provides
information regarding relative binding affinity.

Table 6
Apparent Binding to Herz Ab

L-a.a. Set      D-a.a. Set
YGGFL           YGGFL
YAGFL           YaGFL
YSGFL           YsGFL
LGGFL           YpGFL
FGGFL           fGGFL
YPGFL           yGGFL
LAGFL           faGFL
FAGFL           wGGFL
WGGFL           yaGFL
                fpGFL
                waGFL



VIII. Illustrative Alternative Embodiment

According to an alternative embodiment of the 
invention, the methods provide for attaching to the 
surface a caged binding member which in its caged form 
has a relatively low affinity for other potentially 
binding species, such as receptors and specific binding 
substances. Such techniques are more fully described 
in copending application Serial No. 404,920, filed 
September 8, 1989, and incorporated herein by reference
for all purposes.

According to this alternative embodiment, the 
invention provides methods for forming predefined regions 
on a surface of a solid support, wherein the predefined 
regions are capable of immobilizing receptors. The 
methods make use of caged binding members attached to the 
surface to enable selective activation of the predefined 
regions. The caged binding members are liberated to act 
as binding members ultimately capable of binding 
receptors upon selective activation of the predefined
regions. The activated binding members are then used to
immobilize specific molecules such as receptors on the
predefined region of the surface. The above procedure is
repeated at the same or different sites on the surface so
as to provide a surface prepared with a plurality of
regions on the surface containing, for example, the same
or different receptors. When receptors immobilized in
this way have a differential affinity for one or more
ligands, screenings and assays for the ligands can be
conducted in the regions of the surface containing the
receptors.

The alternative embodiment may make use of 
novel caged binding members attached to the substrate. 
Caged (unactivated) members have a relatively low 
affinity for receptors of substances that specifically 
bind to uncaged binding members when compared with the 
corresponding affinities of activated binding members. 
Thus, the binding members are protected from reaction 
until a suitable source of energy is applied to the 
regions of the surface desired to be activated. Upon 
application of a suitable energy source, the caging 
groups labilize, thereby presenting the activated binding 
member. A typical energy source will be light. 

Once the binding members on the surface are 
activated they may be attached to a receptor. The 
receptor chosen may be a monoclonal antibody, a nucleic 
acid sequence, a drug receptor, etc. The receptor will 
usually, though not always, be prepared so as to permit 
attaching it, directly or indirectly, to a binding 
member. For example, a specific binding substance having 
a strong binding affinity for the binding member and a 
strong affinity for the receptor or a conjugate of the 
receptor may be used to act as a bridge between binding 
members and receptors if desired. The method uses a 
receptor prepared such that the receptor retains its 
activity toward a particular ligand.

Preferably, the caged binding member attached
to the solid substrate will be a photoactivatable biotin
complex, i.e., a biotin molecule that has been chemically
modified with photoactivatable protecting groups so that
it has a significantly reduced binding affinity for
avidin or avidin analogs compared with natural biotin. In a
preferred embodiment, the protecting groups localized in
a predefined region of the surface will be removed upon
application of a suitable source of radiation to give
binding members that are biotin or a functionally
analogous compound having substantially the same binding
affinity for avidin or avidin analogs as does biotin.

In another preferred embodiment, avidin or an 
avidin analog is incubated with activated binding members 
on the surface until the avidin binds strongly to the 
binding members. The avidin so immobilized on predefined 
regions of the surface can then be incubated with a 
desired receptor or conjugate of a desired receptor. The 
receptor will preferably be biotinylated, e.g., a 
biotinylated antibody, when avidin is immobilized on the 
predefined regions of the surface. Alternatively, a 
preferred embodiment will present an avidin/biotinylated 
receptor complex, which has been previously prepared, to 
activated binding members on the surface. 

IX. Conclusion 

The present inventions provide greatly improved 
methods and apparatus for synthesis of polymers on 
substrates. It is to be understood that the above 
description is intended to be illustrative and not 
restrictive. Many embodiments will be apparent to those 
of skill in the art upon reviewing the above description. 
By way of example, the invention has been described 
primarily with reference to the use of photoremovable 
protective groups, but it will be readily recognized by 
those of skill in the art that sources of radiation other 
than light could also be used. For example, in some
embodiments it may be desirable to use protective
groups which are sensitive to electron beam irradiation
or x-ray irradiation, in combination with electron beam
lithography or x-ray lithography techniques.
Alternatively, the group could be removed by exposure
to an electric current.

A preferred class of photoremovable protecting
groups has the general formula:

[structural formula]

where R1, R2, R3, and R4 independently are a hydrogen atom,
a lower alkyl, aryl, benzyl, halogen, hydroxyl, alkoxyl,
thiol, thioether, amino, nitro, carboxyl, formate,
formamido or phosphido group, or adjacent substituents
(i.e., R1-R2, R2-R3, R3-R4) are substituted oxygen groups
that together form a cyclic acetal or ketal; R5 is a
hydrogen atom, an alkoxyl, alkyl, hydrogen, halo, aryl, or
alkenyl group, and n = 0 or 1.

A preferred protecting group, 6-nitroveratryl
(NV), which is used for protecting the carboxyl terminus
of an amino acid or the hydroxyl group of a nucleotide,
for example, is formed when R2 and R3 are each a methoxy
group, R1, R4 and R5 are each a hydrogen atom, and n = 0:

[Structural formula of NV not reproduced]

A preferred protecting group,
6-nitroveratryloxycarbonyl (NVOC), which is used to
protect the amino terminus of an amino acid, for example,
is formed when R2 and R3 are each a methoxy group, R1, R4
and R5 are each a hydrogen atom, and n = 1:

[Structural formula of NVOC not reproduced]

Another preferred protecting group,
6-nitropiperonyl (NP), which is used for protecting the
carboxyl terminus of an amino acid or the hydroxyl group
of a nucleotide, for example, is formed when R2 and R3
together form a methylene acetal, R1, R4 and R5 are each a
hydrogen atom, and n = 0:

[Structural formula of NP not reproduced]

Another preferred protecting group,
6-nitropiperonyloxycarbonyl (NPOC), which is used to
protect the amino terminus of an amino acid, for example,
is formed when R2 and R3 together form a methylene acetal,
R1, R4 and R5 are each a hydrogen atom, and n = 1:

[Structural formula of NPOC not reproduced]

A most preferred protecting group,
methyl-6-nitroveratryl (MeNV), which is used for
protecting the carboxyl terminus of an amino acid or the
hydroxyl group of a nucleotide, for example, is formed
when R2 and R3 are each a methoxy group, R1 and R4 are
each a hydrogen atom, R5 is a methyl group, and n = 0:

[Structural formula of MeNV not reproduced]

Another most preferred protecting group,
methyl-6-nitroveratryloxycarbonyl (MeNVOC), which is used
to protect the amino terminus of an amino acid, for
example, is formed when R2 and R3 are each a methoxy
group, R1 and R4 are each a hydrogen atom, R5 is a methyl
group, and n = 1:

[Structural formula of MeNVOC not reproduced]

Another most preferred protecting group,
methyl-6-nitropiperonyl (MeNP), which is used for
protecting the carboxyl terminus of an amino acid or the
hydroxyl group of a nucleotide, for example, is formed
when R2 and R3 together form a methylene acetal, R1 and R4
are each a hydrogen atom, R5 is a methyl group, and n = 0:

[Structural formula of MeNP not reproduced]

Another most preferred protecting group,
methyl-6-nitropiperonyloxycarbonyl (MeNPOC), which is
used to protect the amino terminus of an amino acid, for
example, is formed when R2 and R3 together form a
methylene acetal, R1 and R4 are each a hydrogen atom, R5
is a methyl group, and n = 1:

[Structural formula of MeNPOC not reproduced]

A protected amino acid having a
photoactivatable oxycarbonyl protecting group, such as NVOC
or NPOC or their corresponding methyl derivatives, MeNVOC
or MeNPOC, respectively, on the amino terminus is formed
by acylating the amine of the amino acid with an
activated oxycarbonyl ester of the protecting group.
Examples of activated oxycarbonyl esters of NVOC and
MeNVOC have the general formula:

[Structural formulas of NVOC-X and MeNVOC-X not reproduced]

where X is halogen, mixed anhydride, phenoxy,
p-nitrophenoxy, N-hydroxysuccinimide, and the like.

A protected amino acid or nucleotide having a
photoactivatable protecting group, such as NV or NP or
their corresponding methyl derivatives, MeNV or MeNP,
respectively, on the carboxy terminus of the amino acid
or 5'-hydroxy terminus of the nucleotide, is formed by
acylating the carboxy terminus or 5'-OH with an activated
benzyl derivative of the protecting group. Examples of
activated benzyl derivatives of MeNV and MeNP have the
general formula:

[Structural formulas of MeNV-X and MeNP-X not reproduced]

where X is halogen, hydroxyl, tosyl, mesyl,
trifluoromethyl, diazo, azido, and the like.

Another method for generating protected
monomers is to react the benzylic alcohol derivative of
the protecting group with an activated ester of the
monomer. For example, to protect the carboxyl terminus
of an amino acid, an activated ester of the amino acid is
reacted with the alcohol derivative of the protecting
group, such as 6-nitroveratrol (NVOH). Examples of
activated esters suitable for such uses include
halo-formate, mixed anhydride, imidazoyl formate, acyl
halide, and also include activated esters formed
in situ with common reagents such as DCC and
the like. See Atherton et al. for other examples of
activated esters.

A further method for generating protected
monomers is to react the benzylic alcohol derivative of
the protecting group with an activated carbon of the
monomer. For example, to protect the 5'-hydroxyl group
of a nucleic acid, a derivative having a 5'-activated
carbon is reacted with the alcohol derivative of the
protecting group, such as methyl-6-nitropiperonol
(MePyROH). Examples of nucleotides having activating
groups attached to the 5'-hydroxyl group have the general
formula:

[Structural formula not reproduced]

where Y is a halogen atom, a tosyl, mesyl,
trifluoromethyl, azido, or diazo group, and the like.

Another class of preferred photochemical 
protecting groups has the formula: 




where R1, R2, and R3 independently are a hydrogen atom, a
lower alkyl, aryl, benzyl, halogen, hydroxyl, alkoxyl,
thiol, thioether, amino, nitro, carboxyl, formate,
formamido, sulfonate, sulfido or phosphido group; R4 and
R5 independently are a hydrogen atom or an alkoxy, alkyl,
halo, aryl, or alkenyl group; and n = 0 or 1.

A preferred protecting group,
1-pyrenylmethyloxycarbonyl (PyROC), which is used to
protect the amino terminus of an amino acid, for example,
is formed when R1 through R5 are each a hydrogen atom and
n = 1:

[Structural formula of PyROC not reproduced]


Another preferred protecting group, 
1-pyrenylmethyl (PyR) , which is used for protecting the 
carboxy terminus of an amino acid or the hydroxyl group 
of a nucleotide, for example, is formed when R1 through R5
are each a hydrogen atom and n = 0:




An amino acid having a pyrenylmethyloxycarbonyl
protecting group on its amino terminus is formed by
acylation of the free amine of the amino acid with an
activated oxycarbonyl ester of the pyrenyl protecting
group. Examples of activated oxycarbonyl esters of PyROC
have the general formula:

[Structural formula of PyROC-X not reproduced]

where X is a halogen, mixed anhydride, p-nitrophenoxy,
or N-hydroxysuccinimide group, and the like.

A protected amino acid or nucleotide having a
photoactivatable protecting group, such as PyR, on the
carboxy terminus of the amino acid or 5'-hydroxy terminus
of the nucleic acid, respectively, is formed by acylating
the carboxy terminus or 5'-OH with an activated
pyrenylmethyl derivative of the protecting group.
Examples of activated pyrenylmethyl derivatives of PyR
have the general formula:




where X is a halogen atom, a hydroxyl, diazo, or azido 
group, and the like. 

Another method of generating protected monomers
is to react the pyrenylmethyl alcohol moiety of the
protecting group with an activated ester of the monomer.
For example, an activated ester of an amino acid can be
reacted with the alcohol derivative of the protecting
group, such as pyrenylmethyl alcohol (PyROH), to form the
protected derivative of the carboxy terminus of the amino
acid. Examples of activated esters include halo-formate,
mixed anhydride, imidazoyl formate, acyl halide, and also
include activated esters formed in situ with
common reagents such as DCC and the like.

Clearly, many photosensitive protecting groups
are suitable for use in the present invention.

In preferred embodiments, the substrate is
irradiated to remove the photoremovable protecting groups
and create regions having free reactive moieties and side
products resulting from the protecting group. The
removal rate of the protecting groups depends on the
wavelength and intensity of the incident radiation, as
well as the physical and chemical properties of the
protecting group itself. Preferred protecting groups are
removed at a faster rate and with a lower intensity of
radiation. For example, at a given set of conditions,
MeNVOC and MeNPOC are photolytically removed from the
N-terminus of a peptide chain faster than their
unsubstituted parent compounds, NVOC and NPOC,
respectively.

Removal of the protecting group is accomplished
by irradiation to liberate the reactive group and
degradation products derived from the protecting group.
Not wishing to be bound by theory, it is believed that
photolysis of NVOC- and MeNVOC-protected oligomers
occurs by the following reaction schemes:

NVOC-AA -> 3,4-dimethoxy-6-nitrosobenzaldehyde + CO2 + AA
MeNVOC-AA -> 3,4-dimethoxy-6-nitrosoacetophenone + CO2 + AA

where AA represents the N-terminus of the amino acid
oligomer.

Along with the unprotected amino acid, other
products are liberated into solution: carbon dioxide and
a 3,4-dimethoxy-6-nitrosophenylcarbonyl compound, which
can react with nucleophilic portions of the oligomer in
unwanted secondary reactions. In the case of an
NVOC-protected amino acid, the degradation product is a
nitrosobenzaldehyde, while the degradation product for
a MeNVOC-protected amino acid is a nitrosophenyl ketone.
For instance, it is
believed that the product aldehyde from NVOC degradation
reacts with free amines to form a Schiff base (imine)
that affects the remaining polymer synthesis. Preferred
photoremovable protecting groups yield side products that
react slowly or reversibly with the oligomer on the support.

Again not wishing to be bound by theory, it is 
believed that the product ketone from irradiation of a 
MeNVOC-protected oligomer reacts at a slower rate with 
nucleophiles on the oligomer than the product aldehyde 
from irradiation of the same NVOC-protected oligomer.
Although not unambiguously determined, it is believed
that this difference in reaction rate is due to the
difference in general reactivity between aldehydes and
ketones towards nucleophiles due to steric and electronic
effects.

The photoremovable protecting groups of the 
present invention are readily removed. For example, the 
photolysis of N-protected L-phenylalanine in solution and 
having different photoremovable protecting groups was 
analyzed, and the results are presented in the following 
table: 



Table

Photolysis of Protected L-Phe-OH

                            t1/2 in seconds
Solvent                 NBOC    NVOC    MeNVOC    MeNPOC
Dioxane                 1288     110        24        19
5 mM H2SO4/Dioxane      1575      98        33        22



The half life, t1/2, is the time in seconds
required to remove 50% of the starting amount of
protecting group. NBOC is the 6-nitrobenzyloxycarbonyl
group, NVOC is the 6-nitroveratryloxycarbonyl group,
MeNVOC is the methyl-6-nitroveratryloxycarbonyl group,
and MeNPOC is the methyl-6-nitropiperonyloxycarbonyl
group. The photolysis was carried out in the indicated
solvent with 362/364 nm wavelength irradiation having an
intensity of 10 mW/cm2, and the concentration of each
protected phenylalanine was 0.10 mM.

The table shows that deprotection of NVOC-,
MeNVOC-, and MeNPOC-protected phenylalanine proceeded
faster than the deprotection of NBOC. Furthermore, it
shows that the two derivatives that
are substituted on the benzylic carbon, MeNVOC and
MeNPOC, were photolyzed at the highest rates in both
dioxane and acidified dioxane.
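
Merely by way of illustration, the half lives in the table can be compared as relative rates. The sketch below assumes simple first-order photolysis kinetics, so that k = ln(2)/t1/2; the source states only the half lives, so the kinetic model and the function names are assumptions made for this example.

```python
import math

# Half lives (seconds) transcribed from the photolysis table above.
HALF_LIVES = {
    "dioxane": {"NBOC": 1288, "NVOC": 110, "MeNVOC": 24, "MeNPOC": 19},
    "acidified dioxane": {"NBOC": 1575, "NVOC": 98, "MeNVOC": 33, "MeNPOC": 22},
}


def rate_constant(half_life_s):
    """First-order rate constant k = ln(2) / t1/2 in 1/s (assumed kinetics)."""
    return math.log(2) / half_life_s


def relative_rates(half_lives, reference="NBOC"):
    """Photolysis rate of each group relative to the reference group."""
    k_ref = rate_constant(half_lives[reference])
    return {group: rate_constant(t) / k_ref for group, t in half_lives.items()}


def fraction_remaining(half_life_s, exposure_s):
    """Fraction of protecting group still attached after a given exposure."""
    return 0.5 ** (exposure_s / half_life_s)


for solvent, table in HALF_LIVES.items():
    rel = relative_rates(table)
    print(solvent, {group: round(r, 1) for group, r in rel.items()})
```

Under this assumption the relative rate is simply the ratio of half lives, so MeNPOC in dioxane is removed roughly 68 times faster than NBOC (1288/19).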

1. Use of Photoremovable Groups During
Solid-Phase Synthesis of Peptides

The formation of peptides on a solid-phase
support requires the stepwise attachment of an amino acid
to a substrate-bound growing chain. In order to prevent
unwanted polymerization of the monomeric amino acid under
the reaction conditions, protection of the amino terminus
of the amino acid is required. After the monomer is
coupled to the end of the peptide, the N-terminal
protecting group is removed, and another amino acid is
coupled to the chain. This cycle of coupling and
deprotecting is continued for each amino acid in the
peptide sequence. See Merrifield, J. Am. Chem. Soc.
(1963) 85:2149, and Atherton et al., "Solid Phase
Peptide Synthesis" 1989, IRL Press, London, both
incorporated herein by reference for all purposes. As
described above, the use of a photoremovable protecting
group allows removal of the protecting group from selected
portions of the substrate surface, via patterned
irradiation, during the deprotection cycle of the solid
phase synthesis. This selectivity allows spatial control
of the synthesis: the next amino acid is coupled only to
the irradiated areas.

In one embodiment, the photoremovable
protecting groups of the present invention are attached
to an activated ester of an amino acid at the amino
terminus:

[Structural formula not reproduced: amino acid with side
chain R, amino group protected as NH-X, and activated
carboxyl group Y]

where R is the side chain of a natural or unnatural amino
acid, X is a photoremovable protecting group, and Y is an
activated carboxylic acid derivative. The photoremovable
protecting group, X, is preferably NVOC, NPOC, PyROC,
MeNVOC, MeNPOC, and the like as discussed above. The
activated ester, Y, is preferably a reactive derivative
having a high coupling efficiency, such as an acyl
halide, mixed anhydride, N-hydroxysuccinimide ester,
perfluorophenyl ester, or urethane protected acid, and
the like. Other activated esters and reaction conditions
are well known (See Atherton et al.).
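
Merely by way of illustration, the cycle of patterned deprotection and coupling described in this subsection can be sketched as a small simulation. The function name, the representation of a mask as a set of site indices, and the one-letter monomers are all assumptions made for this example; the sketch models only the bookkeeping of which sites are protected, not the chemistry.

```python
def light_directed_synthesis(n_sites, steps):
    """Simulate cycles of patterned deprotection followed by coupling.

    n_sites -- number of discrete regions on a hypothetical substrate
    steps   -- list of (mask, monomer) pairs; mask is the set of site
               indices irradiated in that cycle
    Returns the chain grown at each site, written in order of addition.
    """
    chains = [""] * n_sites
    protected = [True] * n_sites  # every site starts out protected

    for mask, monomer in steps:
        # Deprotection: only irradiated sites lose their protecting group.
        for site in mask:
            protected[site] = False
        # Coupling: the protected monomer reacts only at deprotected sites,
        # and its own protecting group re-caps the new chain end.
        for site in range(n_sites):
            if not protected[site]:
                chains[site] += monomer
                protected[site] = True
    return chains


steps = [({0, 1}, "G"), ({2, 3}, "A"), ({0, 2}, "F"), ({1, 3}, "L")]
print(light_directed_synthesis(4, steps))  # ['GF', 'GL', 'AF', 'AL']
```

Four cycles with two masks per layer give four distinct dimers, one per site, because each monomer is added only where the preceding irradiation removed the protecting group.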

2. Use of Photoremovable Groups During
Solid-Phase Synthesis of Oligonucleotides

The formation of oligonucleotides on a
solid-phase support requires the stepwise attachment of a
nucleotide to a substrate-bound growing oligomer. In
order to prevent unwanted polymerization of the monomeric
nucleotide under the reaction conditions, protection of
the 5'-hydroxyl group of the nucleotide is required.
After the monomer is coupled to the end of the oligomer,
the 5'-hydroxyl protecting group is removed, and another
nucleotide is coupled to the chain. This cycle of
coupling and deprotecting is continued for each
nucleotide in the oligomer sequence. See Gait,
"Oligonucleotide Synthesis: A Practical Approach" 1984,
IRL Press, London, incorporated herein by reference for
all purposes. As described above, the use of a
photoremovable protecting group allows removal of the
protecting group, via patterned irradiation, from selected
portions of the substrate surface during the deprotection
cycle of the solid phase synthesis. This selectivity
allows spatial control of the synthesis: the next
nucleotide is coupled only to the irradiated areas.

Oligonucleotide synthesis generally involves
coupling an activated phosphorus derivative on the
3'-hydroxyl group of a nucleotide with the 5'-hydroxyl
group of an oligomer bound to a solid support. Two major
chemical methods exist to perform this coupling: the
phosphate-triester and phosphoramidite methods (See Gait).
Protecting groups of the present invention are suitable
for use in either method.

In a preferred embodiment, a photoremovable
protecting group is attached to an activated nucleotide
on the 5'-hydroxyl group:

[Structural formula not reproduced]

where B is the base attached to the sugar ring; R is a
hydrogen atom when the sugar is deoxyribose or R is a
hydroxyl group when the sugar is ribose; P represents an
activated phosphorus group; and X is a photoremovable
protecting group. The photoremovable protecting group,
X, is preferably NV, NP, PyR, MeNV, MeNP, and the like
as described above. The activated phosphorus group, P,
is preferably a reactive derivative having a high
coupling efficiency, such as a phosphate-triester,
phosphoramidite or the like. Other activated phosphorus
derivatives, as well as reaction conditions, are well
known (See Gait).

E. Amino Acid N-Carboxy Anhydrides
Protected With a Photoremovable Group

During Merrifield peptide synthesis, an
activated ester of one amino acid is coupled with the
free amino terminus of a substrate-bound oligomer.
Activated esters of amino acids suitable for the solid
phase synthesis include halo-formate, mixed anhydride,
imidazoyl formate, acyl halide, and also include
activated esters formed in situ with
common reagents such as DCC and the like
(See Atherton et al.). A preferred protected and
activated amino acid has the general formula:



[Structural formula not reproduced]

where R is the side chain of the amino acid and X is a
photoremovable protecting group. This compound is a
urethane-protected amino acid having a photoremovable
protecting group attached to the amine. A more preferred
activated amino acid is formed when the photoremovable
protecting group has the general formula:




where R1, R2, R3, and R4 independently are a hydrogen atom,
a lower alkyl, aryl, benzyl, halogen, hydroxyl, alkoxyl,
thiol, thioether, amino, nitro, carboxyl, formate,
formamido or phosphido group, or adjacent substituents
(i.e., R1-R2, R2-R3, R3-R4) are substituted oxygen groups
that together form a cyclic acetal or ketal; and R5 is a
hydrogen atom or an alkoxyl, alkyl, halo, aryl, or
alkenyl group.

A preferred activated amino acid is
formed when the photoremovable protecting group is
6-nitroveratryloxycarbonyl. That is, R1 and R4 are each a
hydrogen atom, R2 and R3 are each a methoxy group, and R5
is a hydrogen atom. Another preferred activated amino
acid is formed when the photoremovable group is
6-nitropiperonyl: R1 and R4 are each a hydrogen atom, R2
and R3 together form a methylene acetal, and R5 is a
hydrogen atom. Other protecting groups are possible.
Another preferred activated ester is formed when the
photoremovable group is methyl-6-nitroveratryl or
methyl-6-nitropiperonyl.

Another preferred activated amino acid is 
formed when the photoremovable protecting group has the 
general formula: 

[Structural formula not reproduced]

where R1, R2, and R3 independently are a hydrogen atom, a
lower alkyl, aryl, benzyl, halogen, hydroxyl, alkoxyl,
thiol, thioether, amino, nitro, carboxyl, formate,
formamido, sulfonate, sulfido or phosphido group, and R4
and R5 independently are a hydrogen atom or an alkoxy,
alkyl, halo, aryl, or alkenyl group. The
resulting compound is a urethane-protected amino acid
having a pyrenylmethyloxycarbonyl protecting group
attached to the amine. A more preferred embodiment is
formed when R1 through R5 are each a hydrogen atom.

The urethane-protected amino acids having a
photoremovable protecting group of the present invention
are prepared by condensation of an N-protected amino acid
with an acylating agent such as an acyl halide,
anhydride, chloroformate and the like (See Fuller et al.,
U.S. Patent No. 4,946,942 and Fuller et al., J. Amer.
Chem. Soc. (1990) 112:7414-7416, both herein incorporated
by reference for all purposes).

Urethane-protected amino acids having
photoremovable protecting groups are generally useful as
reagents during solid-phase peptide synthesis, and
because of the spatial selectivity possible with the
photoremovable protecting group, are especially useful
for spatially addressed peptide synthesis. These
amino acids are difunctional: the urethane group first
serves to activate the carboxy terminus for reaction with
the amine bound to the surface and, once the peptide bond
is formed, the photoremovable protecting group protects
the newly formed amino terminus from further reaction.
These amino acids are also highly reactive to
nucleophiles, such as deprotected amines on the surface
of the solid support, and due to this high reactivity,
the solid-phase peptide coupling times are significantly
reduced, and yields are typically higher.

1. Example

Light activated formation of a thymidine-cytidine
dimer was carried out. A three dimensional
representation of a fluorescence scan showing a
checkerboard pattern generated by the light-directed
synthesis of a dinucleotide is shown in Fig. 8.
5'-nitroveratryl thymidine was attached to a synthesis
substrate through the 3' hydroxyl group. The
nitroveratryl protecting groups were removed by
illumination through a 500 µm checkerboard mask. The
substrate was then treated with phosphoramidite activated
2'-deoxycytidine. In order to follow the reaction
fluorometrically, the deoxycytidine had been modified
with an FMOC protected aminohexyl linker attached to the
exocyclic amine (5'-O-dimethoxytrityl-4-N-(6-N-
fluorenylmethylcarbamoyl-hexylcarboxy)-2'-deoxycytidine).
After removal of the FMOC protecting group with base, the
regions which contained the dinucleotide were
fluorescently labelled by treatment of the substrate with
1 mM FITC in DMF for one hour.

The three-dimensional representation of the
fluorescent intensity data in Fig. 14 clearly reproduces
the checkerboard illumination pattern used during
photolysis of the substrate. This result demonstrates
that oligonucleotides as well as peptides can be
synthesized by the light-directed method.

C. Binary Masking

In fact, the means for producing a substrate useful
for these techniques are explained in U.S.S.N. 07/492,462
(VLSIPS CIP), which is hereby incorporated herein by reference.



VLSIPS) . 

Briefly, the binary synthesis strategy refers to an
ordered strategy for parallel synthesis of diverse polymer
sequences by sequential addition of reagents which may be
represented by a reactant matrix and a switch matrix, the
product of which is a product matrix. A reactant matrix is a
1 x n matrix of the building blocks to be added. The switch
matrix is all or a subset of the binary numbers from 1 to n
arranged in columns. In preferred embodiments, a binary
strategy is one in which at least two successive steps
illuminate half of a region of interest on the substrate. In
most preferred embodiments, binary synthesis refers to a
synthesis strategy which also factors a previous addition step.
An example is a strategy in which a switch matrix for a masking
strategy halves regions that were previously illuminated,
illuminating about half of the previously illuminated region
and protecting the remaining half (while also protecting about
half of previously protected regions and illuminating about
half of previously protected regions). It will be recognized
that binary rounds may be interspersed with non-binary rounds
and that only a portion of a substrate may be subjected to a
binary scheme, but will still be considered to be a binary
masking scheme within the definition herein. A binary
"masking" strategy is a binary synthesis which uses light to
remove protective groups from materials for addition of other
materials such as nucleotides or amino acids.
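
Merely by way of illustration, the reactant matrix, switch matrix, and product matrix described above can be sketched as follows. The sketch assumes one interpretation of the switch matrix: each column is one n-bit binary number (one region of the substrate) and each row is one synthesis step, with a 1 meaning the region is illuminated in that step. The function names and the use of all 2**n columns are assumptions made for this example.

```python
from itertools import product


def switch_matrix(n_steps):
    """Rows are synthesis steps; columns are all n-bit binary numbers."""
    columns = list(product([0, 1], repeat=n_steps))  # 2**n regions
    return [[col[step] for col in columns] for step in range(n_steps)]


def product_matrix(reactants, switches):
    """Apply each reactant to the regions switched on in its row."""
    n_regions = len(switches[0])
    products = [""] * n_regions
    for reactant, row in zip(reactants, switches):
        for region, illuminated in enumerate(row):
            if illuminated:
                products[region] += reactant
    return products


reactants = ["A", "B", "C", "D"]        # the 1 x n reactant matrix
S = switch_matrix(len(reactants))
P = product_matrix(reactants, S)

print(len(P), "products from", len(reactants), "coupling steps")
print(P)
```

In this sketch every row of the switch matrix illuminates exactly half of the regions, and each successive row halves both the previously illuminated and the previously protected regions, so n coupling steps yield 2**n distinct products (including the empty sequence).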

In particular, this procedure provides a simplified 
and highly efficient method for saturating all possible 
sequences of a defined length polymer. This masking strategy 
is also particularly useful in producing all possible
oligonucleotide sequence probes of a given length.

The technology provided by the present invention has
very broad applications. Although described specifically for
polynucleotide sequences, similar sequencing, fingerprinting,
mapping, and screening procedures can be applied to
polypeptide, carbohydrate, or other polymers. In particular,
the present invention may be used to completely sequence a
given target sequence to subunit resolution. This may be for
de novo sequencing, or may be used in conjunction with a second
sequencing procedure to provide independent verification. See,
e.g., (1988) Science 242:1245. For example, a large
polynucleotide sequence defined by either the Maxam and Gilbert
technique or by the Sanger technique may be verified by using
the present invention.

In addition, by selection of appropriate probes, a 
polynucleotide sequence can be fingerprinted. Fingerprinting 
is a less detailed sequence analysis which usually involves the 
characterization of a sequence by a combination of defined 
features. Sequence fingerprinting is particularly useful 
because the repertoire of possible features which can be tested 
is virtually infinite. Moreover, the stringency of matching is 
also variable depending upon the application. A Southern Blot 
analysis may be characterized as a means of simple fingerprint 
analysis. 

Fingerprinting analysis may be performed to the 
resolution of specific nucleotides, or may be used to determine 
homologies, most commonly for large segments. In particular, 
an array of oligonucleotide probes of virtually any workable 
size may be positionally localized on a matrix and used to 
probe a sequence for either absolute complementary matching, or
homology to the desired level of stringency using selected
hybridization conditions.
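
Merely by way of illustration, probing a target with a positionally localized array of probes can be sketched as below. The sketch treats stringency as a maximum number of tolerated mismatches, which is a crude stand-in for real hybridization conditions; the function names, the (row, column) positions, and the example sequences are all assumptions made for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence (5'->3') that pairs with seq under Watson-Crick rules."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def hybridizing_positions(probes, target, max_mismatches=0):
    """Map each array position whose probe could bind to its mismatch count.

    probes -- dict of array position (row, col) -> probe sequence, 5'->3'
    target -- target sequence, 5'->3'
    max_mismatches -- 0 for absolute complementary matching; larger values
                      model lower stringency (an assumption of this sketch)
    """
    hits = {}
    for position, probe in probes.items():
        site = reverse_complement(probe)
        k = len(site)
        best = min(
            (mismatches(site, target[i:i + k])
             for i in range(len(target) - k + 1)),
            default=None,  # target shorter than probe: no window to test
        )
        if best is not None and best <= max_mismatches:
            hits[position] = best
    return hits


probes = {(0, 0): "TGCA", (0, 1): "GGGG", (1, 0): "ACGA", (1, 1): "TTAA"}
target = "ATGCATCGT"
print(hybridizing_positions(probes, target))
print(hybridizing_positions(probes, target, max_mismatches=2))
```

With absolute matching only the probes at (0, 0) and (1, 0) register; relaxing the threshold to two mismatches lets all four positions register, which is the sense in which the stringency of matching is variable.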

In addition, the present invention provides means for 
mapping analysis of a target sequence or sequences. Mapping 
will usually involve the sequential ordering of a plurality of 
various sequences, or may involve the localization of a 
particular sequence within a plurality of sequences. This may
be achieved by immobilizing particular large segments onto the 
matrix and probing with a shorter sequence to determine which 
of the large sequences contain that smaller sequence. 
Alternatively, relatively shorter probes of known or random 
sequence may be immobilized to the matrix and a map of various 
different target sequences may be determined from overlaps. 
Principles of such an approach are described in some detail by 
Evans et al. (1989) "Physical Mapping of Complex Genomes by
Cosmid Multiplex Analysis," Proc. Natl. Acad. Sci. USA 86:5030-
5034; Michiels et al. (1987) "Molecular Approaches to Genome
Analysis: A Strategy for the Construction of Ordered Overlap
Clone Libraries," CABIOS 3:203-210; Olsen et al. (1986)
"Random-Clone Strategy for Genomic Restriction Mapping in
Yeast," Proc. Natl. Acad. Sci. USA 83:7826-7830; Craig et al.
(1990) "Ordering of Cosmid Clones Covering the Herpes Simplex
Virus Type I (HSV-I) Genome: A Test Case for Fingerprinting by
Hybridization," Nuc. Acids Res. 18:2653-2660; and Coulson et
al. (1986) "Toward a Physical Map of the Genome of the Nematode
Caenorhabditis elegans," Proc. Natl. Acad. Sci. USA 83:7821-
7825; each of which is hereby incorporated herein by reference.

Fingerprinting analysis also provides a means of 
identification. In addition to its value in apprehension of 
criminals from whom a biological sample, e.g., blood, has been
collected, fingerprinting can ensure personal identification 
for other reasons. For example, it may be useful for 
identification of bodies in tragedies such as fire, flood, and 
vehicle crashes. In other cases the identification may be 
useful in identification of persons suffering from amnesia, or 
of missing persons. Other forensics applications include 
establishing the identity of a person, e.g., military 
identification "dog tags", or identifying the
source of particular biological samples. Fingerprinting
technology is described, e.g., in Carrano et al. (1989) "A
High-Resolution, Fluorescence-Based, Semi-automated Method for
DNA Fingerprinting," Genomics 4:129-136, which is hereby
incorporated herein by reference. See, e.g., Table I, for
nucleic acid applications; corresponding applications may
be accomplished using polypeptides.

TABLE I
VLSIPS PROJECT IN NUCLEIC ACIDS

I.   Construction of Chips

II.  Applications
     A. Sequencing
        1. Primary sequencing
        2. Secondary sequencing (sequence checking)
        3. Large scale mapping
        4. Fingerprinting
     B. Duplex/Triplex formation
        1. Antisense
        2. Sequence specific function modulation
           (e.g. promoter inhibition)
     C. Diagnosis
        1. Genetic markers
        2. Type markers
           a. Blood donors
           b. Tissue transplants
     D. Microbiology
        1. Clinical microbiology
        2. Food microbiology

III. Instrumentation
     A. Chip machines
     B. Detection

IV.  Software Development
     A. Instrumentation software
     B. Data reduction software
     C. Sequence analysis software

The fingerprinting analysis may be used to perform
various types of genetic screening. For example, a single
substrate may be generated with a plurality of screening
probes, allowing for the simultaneous genetic screening for a
large number of genetic markers. Thus, prenatal or diagnostic
screening can be simplified, economized, and made more
generally accessible.

In addition to the sequencing, fingerprinting, and 
mapping applications, the present invention also provides means 
for determining specificity of interaction with particular 
sequences. Many of these applications were described in
U.S.S.N. 07/362,901 (VLSIPS parent), U.S.S.N. 07/492,462
(VLSIPS CIP), U.S.S.N. 07/435,316 (caged biotin parent), and
U.S.S.N. 07/612,671 (caged biotin CIP).

E. Detection Methods and Apparatus 
An appropriate detection method applicable to the 
selected labeling method can be selected. Suitable labels 
include radionuclides, enzymes, substrates, cofactors,
inhibitors, magnetic particles, heavy metal atoms, and
particularly fluorescers, chemiluminescers, and spectroscopic
labels. Patents teaching the use of such labels include U.S. 
Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 
4,277,437; 4,275,149; and 4,366,241. 

With an appropriate label selected, the detection 
system best adapted for high resolution and high sensitivity 
detection may be selected. As indicated above, an optically
detectable system, e.g., fluorescence or chemiluminescence,
would be preferred. Other detection systems may be adapted to
the purpose, e.g., electron microscopy, scanning electron
microscopy (SEM), scanning tunneling electron microscopy
(STEM), infrared microscopy, atomic force microscopy (AFM),
electrical conductance, and image plate transfer.

With a detection method selected, an apparatus for
scanning the substrate will be designed. Apparatus as
described in U.S.S.N. 07/362,901 (VLSIPS parent); or U.S.S.N.
07/492,462 (VLSIPS CIP); or U.S.S.N. __/___,___, attorney
docket number 11509-28 (automated VLSIPS), are particularly
appropriate. Design modifications may also be incorporated
therein.



F. Data Analysis 

Data is analyzed by processes similar to those
described below in the section describing theoretical analysis.
More efficient algorithms will be mathematically devised, and
will usually be designed to be performed on a computer.
Various computer programs which may more quickly or efficiently
make measurement samples and distinguish signal from noise will
also be devised. See, particularly, U.S.S.N. __/___,___,
attorney docket number 11509-28 (automated VLSIPS).

The initial data resulting from the detection system 
is an array of data indicative of fluorescent intensity versus 
location on the substrate. The data are typically taken over 
regions substantially smaller than the area in which synthesis 
of a given polymer has taken place. Merely by way of example, 
if polymers were synthesized in squares on the substrate having 
dimensions of 500 microns by 500 microns, the data may be taken 
over regions having dimensions of 5 microns by 5 microns. In 
most preferred embodiments, the regions over which fluorescence
data are taken across the substrate are less than about 1/2 the
area of the regions in which individual polymers are
synthesized, preferably less than 1/10 the area in which a
single polymer is synthesized, and most preferably less than 
1/100 the area in which a single polymer is synthesized. 
Hence, within any area in which a given polymer has been 
synthesized, a large number of fluorescence data points are 
collected. 

A plot of number of pixels versus intensity for a 
scan should bear a rough resemblance to a bell curve, but 
spurious data are observed, particularly at higher intensities. 
Since it is desirable to use an average of fluorescent 
intensity over a given synthesis region in determining relative 
binding affinity, these spurious data will tend to undesirably 
skew the data. 

Accordingly, in one embodiment of the invention the
data are corrected for removal of these spurious data points,
and an average of the data points is thereafter utilized in
determining relative binding efficiency. In general the data
are fitted to a base curve and statistical measures are used
to remove spurious data.
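
Merely by way of illustration, the correction described above
may be sketched as follows. The specification does not fix a
particular base curve or statistical measure; this sketch
assumes a robust rule in which pixels lying more than a chosen
multiple of the scaled median absolute deviation from the
median are discarded before averaging. The function name, the
cutoff of 3.0, and the sample intensities are hypothetical.

```python
import statistics

def robust_region_mean(intensities, cutoff=3.0):
    """Average the pixel intensities of one synthesis region after
    discarding outliers (assumed rule: median +/- cutoff * scaled MAD)."""
    med = statistics.median(intensities)
    mad = statistics.median(abs(x - med) for x in intensities)
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        # All pixels (nearly) identical: nothing to reject.
        return statistics.fmean(intensities)
    kept = [x for x in intensities if abs(x - med) <= cutoff * scale]
    return statistics.fmean(kept)

# Hypothetical pixel readings; 950 stands in for a spurious bright pixel.
pixels = [100, 102, 98, 101, 99, 103, 97, 950]
print(statistics.fmean(pixels))     # plain average, skewed upward
print(robust_region_mean(pixels))   # average after outlier removal
```

In this hypothetical data the single bright pixel roughly
doubles the plain average, whereas the corrected average
reflects the remaining seven pixels only.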

In an additional analytical tool, various degeneracy 
reducing analogues may be incorporated in the hybridization 
probes. Various aspects of this strategy are described, e.g.,
in Macevicz, S. (1990) PCT publication number WO 90/04652, 
which is hereby incorporated herein by reference. 

II. THEORETICAL ANALYSIS

The principle of the hybridization sequencing 
procedure is based, in part, upon the ability to determine 
overlaps of short segments. The VLSIPS technology provides the 
ability to generate reagents which will saturate the possible
short subsequence recognition possibilities. The principle is
most easily illustrated by using a binary sequence, such as a
sequence of zeros and ones. Once having illustrated the
application to a binary alphabet, the principle may easily be
understood to encompass three letter, four letter, five or more
letter, even 20 letter alphabets. A theoretical treatment of
analysis of subsequence information to reconstruction of a
target sequence is provided, e.g., in Lysov, Yu., et al. (1988)
Doklady Akademii Nauk SSSR 303:1508-1511; Khrapko, K., et al.
(1989) FEBS Letters 256:118-122; Pevzner, P. (1989) J. of
Biomolecular Structure and Dynamics 7:63-69; and Drmanac, R. et
al. (1989) Genomics 4:114-128; each of which is hereby
incorporated herein by reference. 

The reagents for recognizing the subsequences will 
usually be specific for recognizing a particular polymer 
subsequence anywhere within a target polymer. It is preferable 
that conditions may be devised which allow absolute 
discrimination between high fidelity matching and very low 
levels of mismatching. The reagent interaction will preferably 
exhibit no sensitivity to flanking sequences, to the 
subsequence position within the target, or to any other remote 
structure within the sequence. For polynucleotide sequencing, 
the specific reagents can be oligonucleotide probes; for
polypeptides and carbohydrates, antibodies will be useful
reagents. Antibody reagents should also be useful for other 
types of polymers. 

A. Simple n-mer structure: Theory 

1. Simple two letter alphabet: example 

A simple example is presented below of how a sequence 
of ten digits comprising zeros and ones would be sequenceable 
using short segments of five digits. For example, consider the 
sample ten digit sequence:

1010011100. 

A VLSIPS substrate could be constructed, as discussed 
elsewhere, which would have reagents attached in a defined 
matrix pattern which specifically recognize each of the 
possible five digit sequences of ones and zeros. The number of
possible five digit subsequences is 2^5 = 32. The number of
possible different sequences 10 digits long is 2^10 = 1,024.
The five contiguous digit subsequences within a ten digit
sequence number six, i.e., positioned at digits 1-5, 2-6, 3-7,
4-8, 5-9, and 6-10. It will be noted that the specific order
of the digits in the sequence is important and that the order
is directional, e.g., running left to right versus right to
left. The first five digit sequence contained in the target
sequence is 10100. The second is 01001, the third is 10011,
the fourth is 00111, the fifth is 01110, and the sixth is
11100.

The VLSIPS substrate would have a matrix pattern of 
positionally attached reagents which recognize each of the 
different 5-mer subsequences. Those reagents which recognize 
each of the 6 contained 5-mers will bind the target, and a 
label allows the positional determination of where the sequence 
specific interaction has occurred. By correlation of the
position in the matrix pattern, the corresponding bound
subsequences can be determined.

In the above-mentioned sequence, six different 5-mer
sequences would be determined to be present. They would be:

10100
01001
10011
00111
01110
11100
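
Merely by way of illustration, the enumeration of contiguous
subsequences may be sketched as follows. The function name is
hypothetical and the sketch is not part of the described
apparatus; it simply slides a window of five digits along the
ten digit example.

```python
def contiguous_kmers(sequence, k):
    """Return every contiguous k-character subsequence, left to right."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

for fragment in contiguous_kmers("1010011100", 5):
    print(fragment)
```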

Any sequence which contains the first five digit 
sequence, 10100, already narrows the number of possible 
sequences (e.g., from 1024 possible sequences) which contain it
to less than about 192 possible sequences. 

This 192 is derived from the observation that with 
the subsequence 10100 at the far left of the sequence, in 
positions 1-5, there are only 32 possible sequences. Likewise,
for that particular subsequence in positions 2-6, 3-7, 4-8,
5-9, and 6-10. So, to sum up all of the sequences that could
contain 10100, there are 32 for each position and 6 positions 
for a total of about 192 possible sequences. However, some of 
these 10 digit sequences will have been counted twice. Thus, 
by virtue of containing the 10100 subsequence, the number of 
possible 10-mer sequences has been decreased from 1024 
sequences to less than about 192 sequences. 
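
The upper bound of 192 can be checked against an exact count
by direct enumeration. The following is only a sketch under
the assumption of the ten digit binary example above; the
function name is hypothetical.

```python
from itertools import product

def count_containing(subsequence, length, alphabet="01"):
    """Count the sequences of the given length that contain the subsequence."""
    return sum(
        subsequence in "".join(digits)
        for digits in product(alphabet, repeat=length)
    )

print(count_containing("10100", 10))
```

The enumeration gives 191 rather than 192, because exactly one
ten digit sequence, 1010010100, contains the subsequence twice
(at positions 1-5 and 6-10) and is therefore counted twice in
the sum of 6 x 32.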

In this example, not only do we know that the sequence
contains 10100, but we also know that it contains the second
five character sequence, 01001. By virtue of knowing that the
sequence contains 10100, we can look specifically to determine
whether the sequence contains a subsequence of five characters
which contains the four leftmost digits plus a next digit to
the left. For example, we would look for a sequence of X1010,
but we find that there is none. Thus, we know that the 10100
must be at the left end of the 10-mer. We would also look to
see whether the sequence contains the rightmost four digits
plus a next digit to the right, e.g., 0100X. We find that the
sequence also contains the sequence 01001, and that X is a 1.
Thus, we know at least that our target sequence has an overlap
of 0100 and has the left terminal sequence 101001.

Applying the same procedure to the second 5-mer, we
also know that the sequence must include a sequence of five
digits having the sequence 1001Y where Y must be either 0 or 1.
We look through the fragments and we see that we have a 10011
sequence within our target, thus Y is also 1. Thus, we would
know that our sequence has a sequence of the first seven being
1010011.

Moving to the next 5-mer, we know that there must be
a sequence of 0011Z, where Z must be either 0 or 1. We look at
the fragments produced above and see that the target sequence
contains a 00111 subsequence and Z is 1. Thus, we know the
sequence must start with 10100111.

The next 5-mer must be of the sequence 0111W where W
must be 0 or 1. Again, looking up at the fragments produced,
we see that the target sequence contains a 01110 subsequence,
and W is a 0. Thus, our sequence to this point is 101001110.
We know that the last 5-mer must be either 11100 or 11101.
Looking above, we see that it is 11100 and that must be the
last of our sequence. Thus, we have determined that our
sequence must have been 1010011100.
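
The stepwise walk of the preceding paragraphs may be sketched
in code as follows. This is a minimal illustration only: it
assumes that each fragment occurs once in the target and that
no four digit overlap is repeated, and it raises an error
rather than guessing when those assumptions fail. The function
name is hypothetical.

```python
def reconstruct(fragments):
    """Rebuild a target from its set of overlapping k-character fragments.

    Assumes every fragment occurs once and no (k-1)-character overlap
    is repeated; otherwise a ValueError is raised.
    """
    pool = set(fragments)
    k = len(next(iter(pool)))
    suffixes = {f[1:] for f in pool}
    # The left end is the fragment that no other fragment overlaps on its left.
    starts = [f for f in pool if f[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("left end is ambiguous")
    sequence = starts[0]
    pool.remove(sequence)
    while pool:
        overlap = sequence[-(k - 1):]
        candidates = [f for f in pool if f[:-1] == overlap]
        if len(candidates) != 1:
            raise ValueError("extension is ambiguous or missing")
        sequence += candidates[0][-1]  # append the one new digit
        pool.remove(candidates[0])
    return sequence

observed = ["01001", "11100", "10100", "00111", "10011", "01110"]
print(reconstruct(observed))
```

The fragments are deliberately supplied out of order, since
the positions of positive signals on the substrate carry no
information about their order in the target.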

However, it will be recognized from the example above 
with the sequences provided therein, that the sequence analysis 
can start with any known positive probe subsequence. The 
determination may be performed by moving linearly along the 
sequence checking the known sequence with a limited number of 
next positions. Given this possibility, the sequence may be 
determined, besides by scanning all possible oligonucleotide 
probe positions, by specifically looking only where the next 
possible positions would be. This may increase the complexity 
of the scanning but may provide a longer time span dedicated 
towards scanning and detecting specific positions of interest 
relative to other sequence possibilities. Thus, the scanning 
apparatus could be set up to work its way along a sequence from 
a given contained oligonucleotide to only look at those
positions on the substrate which are expected to have a 
positive signal. 

It is seen that given a sequence, it can be
deconstructed into n-mers to produce a set of internal contiguous
subsequences. From any given target sequence, we would be able
to determine what fragments would result. The hybridization
sequence method depends, in part, upon being able to work in
the reverse, from a set of fragments of known sequences to the
full sequence. In simple cases, one is able to start at a 
single position and work in either or both directions towards 
the ends of the sequence as illustrated in the example. 

The number of possible sequences of a given length 
increases very quickly with the length of that sequence. Thus, 
a 10-mer of zeros and ones has 1024 possibilities, a 12-mer has
4096. A 20-mer has over a million possibilities, and a 30-mer
has over a billion. However, a given 30-mer has, at most, 26
different internal 5-mer sequences. Thus, a 30 character
target sequence having over a billion possible sequences can be
substantially defined by only 26 different 5-mers. It will be
recognized that the probe oligonucleotides will preferably, but
need not necessarily, be of identical length, and that the
probe sequences need not necessarily be contiguous in that the
overlapping subsequences need not differ by only a single
subunit. Moreover, each position of the matrix pattern need
not be homogeneous, but may actually contain a plurality of
probes of known sequence. In addition, although all of the
possible subsequence specifications would be preferred, a less
than full set of sequence specifications could be used. In
particular, although a substantial fraction will preferably be
at least about 70%, it may be less than that. About 20% would
be preferred, more preferably at least about 30% would be
desired. Higher percentages would be especially preferred.

2. Example of four letter alphabet

A four letter alphabet may be conceptualized in at
least two different ways from the two letter alphabet. One
way is to consider the four possible values at each position
and to analogize in a similar fashion to the binary example
each of the overlaps. A second way is to group the binary
digits into groups.

Using the first means, the overlap comparisons are
performed with a four letter alphabet rather than a two letter
alphabet. Then, in contrast to the binary system with 10
positions where 2^10 = 1024 possible sequences, in a 4-character
alphabet with 10 positions, there will actually be 4^10 =
1,048,576 possible sequences. Thus, a four character sequence
has a much larger number of possible sequences compared to a
two character sequence. Note, however, that there are still
only 6 different internal 5-mers. For simplicity, we shall
examine a 5 character string with 3 character subsequences.
Instead of only 1 and 0, the characters may be designated,
e.g., A, C, G, and T. Let us take the sequence GGCTA. The
3-mer subsequences are:

GGC 
GCT 
CTA 

Given these subsequences, there is one sequence, or at most
only a few sequences which would produce that combination of
subsequences, i.e., GGCTA.

Alternatively, with a four character universe, the
binary system can be looked at in pairs of digits. The pairs
would be 00, 01, 10, and 11. In this manner, the earlier used
sequence 1010011100 is looked at as 10,10,01,11,00. Then the
first character of two digits is selected from the possible
universe of the four representations 00, 01, 10, and 11. Then
a probe would be in an even number of digits, e.g., not five
digits, but three pairs of digits or six digits. A similar
comparison is performed and the possible overlaps determined.
The 3-pair subsequences are:

10,10,01
10,01,11
01,11,00

and the overlap reconstruction produces 10,10,01,11,00.

The latter of the two conceptual views of the 4 
letter alphabet provides a representation which is similar to 
what would be provided in a digital computer. The 
applicability to a four nucleotide alphabet is easily seen by 
assigning, e.g., 00 to A, 01 to C, 10 to G, and 11 to T. And, 
in fact, if such a correspondence is used, both examples for 
the 4 character sequences can be seen to represent the same 
target sequence. The applicability of the hybridization method 
and its analysis for determining the ultimate sequence is 
easily seen if A is the representation of adenine, C is the
representation of cytosine, G is the representation of guanine,
and T is the representation of thymine or uracil.
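
The correspondence between the two views may be sketched as
follows, using the assignment given above (00 to A, 01 to C,
10 to G, and 11 to T). The names used are hypothetical; the
sketch simply reads the binary example two digits at a time.

```python
PAIR_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}

def pairs_to_bases(bits):
    """Translate a binary string, two digits at a time, into A/C/G/T."""
    if len(bits) % 2:
        raise ValueError("expected an even number of binary digits")
    return "".join(PAIR_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

print(pairs_to_bases("1010011100"))
```

Under this assignment the ten digit binary example and the
five character example GGCTA are the same target sequence.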

3. Generalization to m-letter Alphabet 
This reconstruction process may be applied to 
polymers of virtually any number of possible characters in the 
alphabet, and for virtually any length sequence to be 
sequenced, though limitations, as discussed below, will limit 
its efficiency at various extremes of length. It will be
recognized that the theory can be applied to a large diversity 
of systems where sequence is important. 

For example, the method could be applied to
sequencing of a polypeptide. A polypeptide can have any of 
twenty natural amino acid possibilities at each position. A 
twenty letter alphabet is amenable to sequencing by this method 
so long as reagents exist for recognizing shorter subsequences 
therein. A preferred reagent for achieving that goal would be 
a set of monoclonal antibodies each of which recognizes a 
specific three contiguous amino acid subsequence. A complete 
set of antibodies which recognize all possible subsequences of 
a given length, e.g., 3 amino acids, and preferably with a 
uniform affinity, would be 20^3 = 8000 reagents.

It will also be recognized that each target sequence 
which is recognized by the specific reagents need not have 
homogeneous termini. Thus, fragments of the entire target 
sequence will also be useful for hybridizing appropriate 
subsequences. It is, however, preferable that there not be a 
significant amount of labeled homogeneous contaminating 
extraneous sequences. This constraint does usually require the 
purification of the target molecule to be sequenced, but a 
specific label technique would dispense with a purification 
requirement if the unlabeled extraneous sequences do not 
interfere with the labeled sequences. 

In addition, conformational effects of target 
polypeptide folding may, in certain embodiments, be negligible 
if the polypeptide is fragmented into sufficiently small 
peptides, or if the interaction is performed under conditions 
where conformation, but not specific interaction, is disrupted.

B. Complications

Two obvious complications exist with the method of 
sequence analysis by hybridization. The first results from a 
probe of inappropriate length while the second relates to 
internally repeated sequences. 

The first obvious complication is a problem which
arises from an inappropriate length of recognition sequence,
which causes problems with the specificity of recognition. For
example, if the recognized sequence is too short, every
sequence which is utilized will be recognized by every probe
sequence. This occurs, e.g., in a binary system where the
probes are each of sequences which occur relatively frequently,
e.g., a two character probe for the binary system. Each
possible two character probe would be expected to appear 1/4 of
the time in every single two character position. Thus, the
above sequence example would be recognized by each of the 00,
10, 01, and 11. Thus, the sequence information is virtually
lost because the resolution is too low and each recognition 
reagent specifically binds at multiple sites on the target 
sequence. 
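
The loss of resolution with very short probes may be
illustrated with the ten digit example. The following sketch,
whose function name is hypothetical, compares the probes that
would bind the target with the full set of possible probes of
the same length.

```python
from itertools import product

def bound_and_possible(target, k, alphabet="01"):
    """Return (probes present in the target, all possible probes) of length k."""
    possible = {"".join(p) for p in product(alphabet, repeat=k)}
    present = {target[i:i + k] for i in range(len(target) - k + 1)}
    return present, possible

for k in (2, 5):
    present, possible = bound_and_possible("1010011100", k)
    print(k, len(present), "of", len(possible))
```

With two character probes every possible probe binds, so the
pattern of positive signals says nothing about the target; with
five character probes only 6 of the 32 possible probes bind.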

The number of different probes which bind to a target 
depends on the relationship between the probe length and the 
target length. At the extreme of short probe length, the just 
mentioned problem exists of excessive redundancy and lack of 
resolution. The lack of stability in recognition will also be 
a problem with extremely short probes. At the extreme of long 
probe length, each entire probe sequence is on a different 
position of a substrate. However, a problem arises from the 
number of possible sequences, which goes up dramatically with 
the length of the sequence. Also, the specificity of 
recognition begins to decrease as the contribution to binding 
by any particular subunit may become sufficiently low that the 
system fails to distinguish the fidelity of recognition. 
Mismatched hybridization may be a problem with the 
polynucleotide sequencing applications, though the 
fingerprinting and mapping applications may not be so strict in 
their fidelity requirements. As indicated above, a thirty
position binary sequence has over a billion possible sequences,
a number which starts to become unreasonably large in its
required number of different sequences, even though the target
length is still very short. Preparing a substrate with all
sequence possibilities for a long target may be extremely
difficult due to the many different oligomers which must be 
synthesized. 

The above example illustrates how a long target 
sequence may be reconstructed with a reasonably small number of 
shorter subsequences. Since the present day resolution of the
regions of the substrate having defined oligomer probes
attached to the substrate approaches about 10 microns by 10
microns for resolvable regions, about 10^6, or 1 million,
positions can be placed on a one centimeter square substrate. 
However, high resolution systems may have particular 
disadvantages which may be outweighed using the lower density 
substrate matrix pattern. For this reason, a sufficiently 
large number of probe sequences can be utilized so that any 
given target sequence may be determined by hybridization to a 
relatively small number of probes. 
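
The density figure above follows from simple arithmetic, which
may be sketched as follows. The function name is hypothetical,
and the sketch assumes square regions tiled edge to edge with
no border between them.

```python
def regions_per_square_cm(edge_microns):
    """Number of square regions of the given edge length in 1 cm x 1 cm."""
    microns_per_cm = 10_000
    per_side = microns_per_cm / edge_microns
    return per_side ** 2

print(regions_per_square_cm(10))   # 10 micron regions
print(regions_per_square_cm(100))  # 100 micron regions, for comparison
```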

A second complication relates to convergence of 
sequences to a single subsequence. This will occur when a
particular subsequence is repeated in the target sequence. 
This problem can be addressed in at least two different ways. 
The first, and simpler way, is to separate the repeat sequences 
onto two different targets. Thus, each single target will not 
have the repeated sequence and can be analyzed to its end. 
This solution, however, complicates the analysis by requiring 
that some means for cutting at a site between the repeats can 
be located. Typically a careful sequencer would want to have 
two intermediate cut points so that the intermediate region can 
also be sequenced in both directions across each of the cut 
points. This problem is inherent in the hybridization method 
for sequencing but can be minimized by using a longer known 
probe sequence so that the frequency of probe repeats is 
decreased. 
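
The effect of an internal repeat, and the benefit of a longer
probe, may be illustrated with a pair of hypothetical eight
nucleotide targets which are not taken from the examples
above. Both contain the repeat GC three times and differ only
in the order of the nucleotides between the repeats.

```python
def kmer_set(sequence, k):
    """Set of distinct contiguous k-character subsequences of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

target_a = "GCTGCAGC"  # hypothetical target
target_b = "GCAGCTGC"  # same repeats, intervening nucleotides swapped

# With 3-mer probes the two targets give identical binding patterns.
print(kmer_set(target_a, 3) == kmer_set(target_b, 3))
# With 4-mer probes the patterns differ and the targets are distinguished.
print(kmer_set(target_a, 4) == kmer_set(target_b, 4))
```

In this sketch the 3-mer probes cannot distinguish the two
targets, whereas the 4-mer probes can, since the probe then
spans the repeat together with a flanking nucleotide on each
side.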

Knowing the sequence of flanking sequences of the 
repeat will simplify the use of polymerase chain reaction (PCR)
or a similar technique to further definitively determine the
sequence between sequence repeats. Probes can be made to
hybridize to those known sequences adjacent the repeat
sequences, thereby producing new target sequences for analysis.
See, e.g., Innis et al. (eds.) (1990) PCR Protocols: A Guide
to Methods and Applications, Academic Press; and methods for
synthesis of oligonucleotide probes, see, e.g., Gait (1984)
Oligonucleotide Synthesis: A Practical Approach, IRL Press,
Oxford. 

Other means for dealing with convergence problems
include using particular longer probes, and using degeneracy 
reducing analogues, see, e.g., Macevicz, S. (1990) PCT 
publication number WO 90/04652, which is hereby incorporated 
herein by reference. By use of stretches of the degeneracy 
reducing analogues with other probes in particular 
combinations, the number of probes necessary to fully saturate 
the possible oligomer probes is decreased. For example, with a 
stretch of 12-mers having the central 4-mer of degenerate 
nucleotides, in combination with all of the possible 8-mers, 
the collection numbers twice the number of possible 8-mers, 
e.g. 65,536 + 65,536 = 131,072, but the population provides 
screening equivalent to all possible 12-mers.

By way of further explanation, all possible 
oligonucleotide 8-mers may be depicted in the fashion: 

N1-N2-N3-N4-N5-N6-N7-N8,
in which there are 4^8 = 65,536 possible 8-mers. As described
in U.S. S.N. / , , attorney docket number 11509-28
(automated VLSIPS), producing all possible 8-mers requires
4 x 8 = 32 chemical binary synthesis steps to produce the entire
matrix pattern of 65,536 8-mer possibilities. By incorporating
degeneracy reducing nucleotides, D's, which hybridize
nonselectively to any corresponding complementary nucleotide,
new oligonucleotide 12-mers can be made in the fashion:

N1-N2-N3-N4-D-D-D-D-N5-N6-N7-N8,
in which there are again, as above, only 4^8 = 65,536 possible
"12-mers", which in reality only have 8 different nucleotides.

However, it can be seen that each possible 12-mer 
probe could be represented by a group of the two 8-mer types.
Moreover, repeats of less than 12 nucleotides would not
converge, or cause repeat problems in the analysis. Thus,
instead of requiring a collection of probes corresponding to
all 12-mers, or 4^12 = 16,777,216 different 12-mers, the same
information can be derived by making 2 sets of "8-mers"
consisting of the typical 8-mer collection of 4^8 = 65,536 and
the "12-mer" set with the degeneracy reducing analogues, also
requiring making 4^8 = 65,536. The combination of the two sets
requires making 65,536 + 65,536 = 131,072 different molecules,
but giving the information of 16,777,216 molecules. Thus,
incorporating the degeneracy reducing analogue decreases the
number of molecules necessary to get 12-mer resolution by a
factor of about 128-fold.
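
The arithmetic of the preceding paragraph may be checked as
follows. The variable names are hypothetical; the figures are
those given above.

```python
plain_8mers = 4 ** 8          # N1..N8
gapped_12mers = 4 ** 8        # N1..N4-D-D-D-D-N5..N8, only 8 variable positions
combined = plain_8mers + gapped_12mers
full_12mers = 4 ** 12         # every possible 12-mer made individually

print(combined)                 # molecules actually synthesized
print(full_12mers)              # molecules otherwise required
print(full_12mers // combined)  # reduction factor
```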

C. Non-Polynucleotide Embodiments

The above example is directed towards a
polynucleotide embodiment. This application is relatively
easily achieved because the specific reagents will typically be
complementary oligonucleotides, although in certain embodiments
other specific reagents may be desired. For example, there may
be circumstances where other than complementary base pairing
will be utilized. The polynucleotide targets will usually be
single stranded, but may be double or triple stranded in various
applications. However, a triple stranded specific interaction
might be sometimes desired, or a protein or other specific
binding molecule may be utilized. For example, various
promoter or DNA sequence specific binding proteins might be
used, including, e.g., restriction enzyme binding domains,
other binding domains, and antibodies. Thus, specific
recognition reagents besides oligonucleotides may be utilized.

For other polymer targets, the specific reagents will 
often be polypeptides. These polypeptides may be protein 
binding domains from enzymes or other proteins which display
specificity for binding. Usually an antibody molecule may be
used, and monoclonal antibodies may be particularly desired.
Classical methods may be applied for preparing antibodies, see,
e.g., Harlow and Lane (1988) Antibodies: A Laboratory Manual,
Cold Spring Harbor Press, New York; and Goding (1986)
Monoclonal Antibodies: Principles and Practice (2d Ed.)
Academic Press, San Diego. Other suitable techniques for in
vitro exposure of lymphocytes to the antigens or selection of
libraries of antibody binding sites are described, e.g., in
Huse et al. (1989) Science 246:1275-1281; and Ward et al.
(1989) Nature 341:544-546, each of which is hereby incorporated
herein by reference. Unusual antibody production methods are
also described, e.g., in Hendricks et al. (1989) BioTechnology
7:1271-1274; and Hiatt et al. (1989) Nature 342:76-78, each of
which is hereby incorporated herein by reference. Other
molecules which may exhibit specific binding interaction may be
useful for attachment to a VLSIPS substrate by various methods,
including the caged biotin methods, see, e.g., U.S. S.N.
07/435,316 (caged biotin parent), and U.S. S.N. 07/612,671
(caged biotin CIP).

The antibody specific reagents should be particularly 
useful for the polypeptide, carbohydrate, and synthetic polymer 
applications. Individual specific reagents might be generated 
by an automated process to generate the number of reagents 
necessary to advantageously use the high density positional 
matrix pattern. In an alternative approach, a plurality of 
hybridoma cells may be screened for their ability to bind to a 
VLSIPS matrix possessing the desired sequences whose binding 
specificity is desired. Each cell might be individually grown 
up and its binding specificity determined by VLSIPS apparatus 
and technology. An alternative strategy would be to expose the 
same VLSIPS matrix to a polyclonal serum of high titer. By a 
successively large volume of serum and different animals, each 
region of the VLSIPS substrate would have attached to it a
substantial number of antibody molecules with specificity of 
binding. The substrate, with non-covalently bound antibodies 
could be derivatized and the antibodies transferred to an 
adjacent second substrate in the matrix pattern in which the 
antibody molecules had attached to the first matrix. If the 
sensitivity of detection of binding interaction is sufficiently 
high, such a low efficiency transfer of antibody molecules may 
produce a sufficiently high signal to be useful for many 
purposes, including the sequencing applications.

In another embodiment, capillary forces may be used
to transfer the selected reagents to a new matrix, to which the
reagents would be positionally attached in the pattern of the
recognized sequences. Or, the reagents could be transversely
electrophoresed, magnetically transferred, or otherwise
transported to a new substrate in their retained positional
pattern.

III. POLYNUCLEOTIDE SEQUENCING 

In principle, the making of a substrate having a
positionally defined matrix pattern of all possible
oligonucleotides of a given length involves a conceptually
simple method of synthesizing each and every different possible
oligonucleotide, and affixing each to a definable position.
Oligonucleotide synthesis is presently mechanized and enabled
by current technology, see, e.g., U.S. S.N. 07/362,901 (VLSIPS
parent); U.S. S.N. 07/492,462 (VLSIPS CIP); and instruments
supplied by Applied Biosystems, Foster City, California.

A. Preparation of Substrate Matrix

The collection of specific
oligonucleotides used in polynucleotide sequencing may be
produced in at least two different ways. Present technology
certainly allows production of ten nucleotide oligomers on a
solid phase or other synthesizing system. See, e.g.,
instrumentation provided by Applied Biosystems, Foster City,
California. Although a single oligonucleotide can be
relatively easily made, a large collection of them would
typically require a fairly large amount of time and investment.
For example, there are 4^10 = 1,048,576 possible ten nucleotide
oligomers. Present technology allows making each and every one
of them in a separate purified form though such might be costly
and laborious.

Once the desired repertoire of possible oligomer
sequences of a given length has been synthesized, this
collection of reagents may be individually positionally
attached to a substrate, thereby allowing a batchwise
hybridization step. Present technology also would allow the
possibility of attaching each and every one of these 10-mers to
a separate specific position on a solid matrix. This
attachment could be automated in any of a number of ways,
particularly use of a caged biotin type linking. This would
produce a matrix having each of the different possible 10-mers.

A batchwise hybridization is much preferred because 
of its reproducibility and simplicity. An automated process of 
attaching various reagents to positionally defined sites on a 
substrate is provided in U.S. S.N. 07/492,462 (VLSIPS CIP);
U.S. S.N. / , , attorney docket number 11509-28 (automated
VLSIPS); and U.S. S.N. 07/612,671 (caged biotin CIP), each of
which is hereby incorporated herein by reference. 

Instead of separate synthesis of each 
oligonucleotide, these oligonucleotides are conveniently 
synthesized in parallel by sequential synthetic processes on a 
defined matrix pattern as provided in U.S. S.N. 07/492,462
(VLSIPS CIP); and U.S. S.N. / , , attorney docket number
11509-28 (automated VLSIPS), which are incorporated herein by
reference. Here, the oligonucleotides are synthesized stepwise 
on a substrate at positionally separate and defined positions. 
Use of photosensitive blocking reagents allows for defined 
sequences of synthetic steps over the surface of a matrix 
pattern. By use of the binary masking strategy, the surface of 
the substrate can be positioned to generate a desired pattern 
of regions, each having a defined sequence oligonucleotide 
synthesized and immobilized thereto. 
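
Merely by way of illustration, the economy of masked parallel
synthesis may be sketched with a simplified simulation. The
masking strategy actually contemplated is described in the
applications incorporated by reference above and may differ;
this sketch assumes a base-by-base scheme in which each step
couples one nucleotide to exactly those regions exposed by a
mask, and all names are hypothetical.

```python
def masked_synthesis(length, alphabet="ACGT"):
    """Simulate stepwise synthesis of every sequence of the given length.

    Each step applies one mask and couples one nucleotide; the mask for a
    given round and nucleotide exposes the regions whose index has the
    matching digit when written in base len(alphabet).
    """
    m = len(alphabet)
    regions = [""] * (m ** length)
    steps = 0
    for position in range(length):
        for digit, nucleotide in enumerate(alphabet):
            for index in range(len(regions)):
                if (index // m ** position) % m == digit:  # region is exposed
                    regions[index] += nucleotide
            steps += 1
    return regions, steps

regions, steps = masked_synthesis(3)
print(steps, "steps yield", len(set(regions)), "different 3-mers")
```

Under these assumptions the number of coupling steps grows as
4 times the length, while the number of distinct sequences
grows as 4 raised to the length, which is consistent with the
4 x 8 = 32 steps stated above for the 65,536 possible 8-mers.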

Although the prior art technology can be used to 
generate the desired repertoire of oligonucleotide probes, an 
efficient and cost effective means would be to use the VLSIPS 
technology described in U.S. S.N. 07/492,462 (VLSIPS CIP) and
U.S. S.N. / , , attorney docket number 11509-28 (automated
VLSIPS). In this embodiment, the photosensitive reagents
involved in the production of such a matrix are described 
below. 

The regions for synthesis may be very small, usually
less than about 100 µm x 100 µm, more usually less than about
50 µm x 50 µm. The photolithography technology allows
synthetic regions of less than about 10 µm x 10 µm, about
3 x 3 µm, or less. The detection also may detect such sized
regions, though larger areas are more easily and reliably
measured.

At a size of about 30 microns by 30 microns, one
million regions would take about 11 centimeters square or a
single wafer of about 4 centimeters by 4 centimeters. Thus the
present technology provides for making a single matrix of that
size having all one million plus possible oligonucleotides.
Region sizes are sufficiently small to correspond to densities
of at least about 5 regions/cm², 20 regions/cm², 50
regions/cm², 100 regions/cm², and greater, including 300
regions/cm², 1000 regions/cm², 3K regions/cm², 10K regions/cm²,
30K regions/cm², 100K regions/cm², 300K regions/cm² or more,
even in excess of one million regions/cm².
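
The relationship between region size and density can be checked with a short calculation. The following is a minimal sketch, assuming square regions tiled edge to edge with no borders or alignment margins (an assumption, so the figures are upper bounds on density); the function names are illustrative only.

```python
# Illustrative sketch: square regions tiled edge to edge, no margins (assumption).
UM_PER_CM = 10_000  # micrometers per centimeter


def regions_per_cm2(edge_um):
    """Number of square regions of the given edge length per square centimeter."""
    per_side = UM_PER_CM / edge_um
    return per_side ** 2


def regions_on_wafer(edge_um, wafer_cm):
    """Whole regions that fit on a square wafer of side wafer_cm."""
    per_side = int(wafer_cm * UM_PER_CM // edge_um)
    return per_side ** 2


for edge in (100, 50, 30, 10, 3):
    print(f"{edge:>3} um regions: about {regions_per_cm2(edge):,.0f} regions/cm^2")

# All 4**10 ten-mers fit on a 4 cm x 4 cm wafer at 30 um per region.
print(regions_on_wafer(30, 4) >= 4 ** 10)
```

Under these assumptions, regions of 10 µm on a side already correspond to one million regions/cm².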

Although the pattern of the regions which contain
specific sequences is theoretically not important, for
practical reasons certain patterns will be preferred in
synthesizing the oligonucleotides. The application of binary
masking algorithms for generating the pattern of known
oligonucleotide probes is described in related U.S. S.N.
/ , , attorney docket number 11509-28 (automated VLSIPS)
which was filed simultaneously with this application. By use
of these binary masks, a highly efficient means is provided for
producing the substrate with the desired matrix pattern of
different sequences. Although the binary masking strategy
allows for the synthesis of all lengths of polymers, the
strategy may be easily modified to provide only polymers of a
given length. This is achieved by omitting steps where a
subunit is not attached.
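
The binary masking idea can be sketched in a few lines. This is an illustrative model only, not the algorithm of the referenced application: it assumes that each coupling step uses a mask exposing exactly half of the regions, and that region number r is exposed in step i when bit i of r is set. The function name and the bit convention are hypothetical.

```python
def binary_synthesis(steps):
    """Model a binary masking scheme.

    steps is the ordered list of monomers, one per coupling step.
    Returns a dict mapping region index to the polymer built there.
    Assumption: region r is exposed in step i when bit i of r is 1.
    """
    n = len(steps)
    products = {}
    for region in range(2 ** n):
        chain = ""
        for i, monomer in enumerate(steps):
            if (region >> i) & 1:  # region deprotected in this step
                chain += monomer
        products[region] = chain
    return products


array = binary_synthesis(["A", "C", "G", "T"])
for region, polymer in sorted(array.items()):
    print(region, repr(polymer))
```

In this model, n coupling steps yield 2^n regions carrying polymers of every length from zero to n, which is why the text notes that the strategy must be modified when only polymers of a single length are wanted.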

The strategy for generating a specific pattern may
take any of a number of different approaches. These approaches
are well described in related application U.S. S.N. / , ,
attorney docket number 11509-28 (automated VLSIPS) and include
a number of binary masking approaches which will not be
exhaustively discussed herein. However, the binary masking and
binary synthesis approaches provide a maximum of diversity with
a minimum number of actual synthetic steps.

The length of oligonucleotides used in sequencing
applications will be selected on criteria determined to some
extent by the practical limits discussed above. For example,
if probes are made as oligonucleotides, there will be 65,536
possible eight nucleotide sequences. If a nine subunit
oligonucleotide is selected, there are 262,144 possible
permutations of sequences. If a ten-mer oligonucleotide is
selected, there are 1,048,576 possible permutations of
sequences. As the number gets larger, the required number of
positionally defined subunits necessary to saturate the
possibilities also increases. With respect to hybridization
conditions, the length of the matching necessary to confer
stability under the conditions selected can be compensated for.
See, e.g., Kanehisa, M. (1984) Nuc. Acids Res. 12:203-213,
which is hereby incorporated herein by reference.
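
The probe counts quoted above follow from four possible nucleotides at each position. A minimal sketch, with illustrative function names, that reproduces the figures and enumerates the probes for a small length:

```python
from itertools import product


def probe_count(length, alphabet_size=4):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def all_probes(length, alphabet="ACGT"):
    """Yield every sequence of the given length over the alphabet."""
    for letters in product(alphabet, repeat=length):
        yield "".join(letters)


for n in range(8, 12):
    print(f"{n}-mers: {probe_count(n):,}")

print(list(all_probes(2))[:5])
```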

Although not described in detail here, but below for
oligonucleotide probes, the VLSIPS technology would typically
use a photosensitive protective group on an oligonucleotide.
Sample oligonucleotides are shown in Figure 1. In particular,
the photoprotective group on the nucleotide molecules may be
selected from a wide variety of positive light reactive groups
preferably including nitro aromatic compounds such as
o-nitrobenzyl derivatives or benzylsulfonyl. See, e.g., Gait (1984)
Oligonucleotide Synthesis: A Practical Approach, IRL Press,
Oxford, which is hereby incorporated herein by reference. In a
preferred embodiment, 6-nitroveratryloxycarbonyl (NVOC),
2-nitrobenzyloxycarbonyl (NBOC), or
α,α-dimethyl-dimethoxybenzyloxycarbonyl (DDZ) is used.
Photoremovable protective groups
are described in, e.g., Patchornik (1970) J. Amer. Chem. Soc.
92:6333- ; and Amit et al. (1974) J. Organic Chem. 39:192- ;
each of which is hereby incorporated herein by reference.

A preferred linker for attaching the oligonucleotide
to a silicon matrix is illustrated in Figure 2. A more
detailed description is provided below. A photosensitive
blocked nucleotide may be attached to specific locations of
unblocked prior cycles of attachments on the substrate and can
be successively built up to the correct length oligonucleotide
probe.

It should be noted that multiple substrates may be
simultaneously exposed to a single target sequence where each
substrate is a duplicate of one another or where, in
combination, multiple substrates together provide the complete
or desired subset of possible subsequences. This provides the
opportunity to overcome a limitation of the density of
positions on a single substrate by using multiple substrates.
In the extreme case, each probe might be attached to a single
bead or substrate and the beads sorted by whether there is a
binding interaction. Those beads which do bind might be
encoded to indicate the subsequence specificity of reagents
attached thereto.

Then, the target may be bound to the whole collection 
of beads and those beads that have appropriate specific 
reagents on them will bind to the target. Then a sorting system
may be utilized to sort those beads that actually bind the 
target from those that do not. This may be accomplished by 
presently available cell sorting devices or a similar 
apparatus. After the relatively small number of beads which 
have bound the target have been collected, the encoding scheme 
may be read off to determine the specificity of the reagent on 
the bead. An encoding system may include a magnetic system, a 
shape encoding system, a color encoding system, or a 
combination of any of these, or any other encoding system. 
Once again, with the collection of specific interactions that
have occurred, the binding may be analyzed for sequence 
information, fingerprint information, or mapping information. 

The parameters of polynucleotide sizes of both the
probes and target sequences are determined by the applications
and other circumstances. The length of the oligonucleotide
probes used will depend in part upon the limitations of the
VLSIPS technology to provide the number of desired probes. For
example, in an absolute sequencing application, it is often
useful to have virtually all of the possible oligonucleotides
of a given length. As indicated above, there are 65,536
8-mers, 262,144 9-mers, 1,048,576 10-mers, 4,194,304 11-mers,
etc. As the length of the oligomer increases, the number of
different probes which must be synthesized also increases at a
rate of a factor of 4 for every additional nucleotide.
Eventually the size of the matrix and the limitations in the
resolution of regions in the matrix will reach the point where
an increase in number of probes becomes disadvantageous.
However, this sequencing procedure requires that the system be
able to distinguish, by appropriate selection of hybridization
and washing conditions, between binding of absolute fidelity
and binding of complementary sequences containing mismatches.
On the other hand, if the fidelity is unnecessary, this
discrimination is also unnecessary and a significantly longer
probe may be used. Significantly longer probes would typically
be useful in fingerprinting or mapping applications.

The length of the probe is selected so that
it will bind with specificity to possible targets. The
hybridization conditions are also very important in that they
will determine how close the homology of complementary binding
will be detected. In fact, a single target may be evaluated at
a number of different conditions to determine its spectrum of
specificity for binding particular probes. This may find use
in a number of other applications besides polynucleotide
sequencing, fingerprinting, or mapping. For example, it will be
desired to determine the spectrum of binding affinities and
specificities of cell surface antigens with binding by
particular antibodies immobilized on the substrate surface,
particularly under different interaction conditions. In a
related fashion, different regions with reagents having
differing affinities or levels of specificity may allow such a
spectrum to be defined using a single incubation, where various
regions, at a given hybridization condition, show the binding
affinity. For example, fingerprint probes of various lengths,
or with specific defined non-matches may be used. Unnatural
nucleotides or nucleotides exhibiting modified specificity of
complementary binding are described in greater detail in
Macevicz (1990) PCT pub. No. WO 90/04652; and see the section
on modified nucleotides in the Sigma Chemical Company
catalogue.

B. Labeling Target Nucleotide

The label used to detect the target sequences will be 
determined, in part, by the detection methods being applied. 
Thus, the labeling method and label used are selected in 
combination with the actual detecting systems being used. 

Once a particular label has been selected, 
appropriate labeling protocols will be applied, as described
below for specific embodiments. Standard labeling protocols
for nucleic acids are described, e.g., in Sambrook et al.;
Kambara, H. et al. (1988) BioTechnology 6:816-821; Smith, L. et
al. (1985) Nuc. Acids Res. 13:2399-2412; for polypeptides, see,
e.g., Allen G. (1989) Sequencing of Proteins and Peptides,
Elsevier, New York, especially chapter 5, and Greenstein and
Winitz (1961) Chemistry of the Amino Acids, Wiley and Sons, New
York. Carbohydrate labeling is described, e.g., in Chaplin and
Kennedy (1986) Carbohydrate Analysis: A Practical Approach,
IRL Press, Oxford. Labeling of other polymers will be 
performed by methods applicable to them as recognized by a 
person having ordinary skill in manipulating the corresponding 
polymer. 

In some embodiments, the target need not actually be 
labeled if a means for detecting where interaction takes place 
is available. As described below, for a nucleic acid 
embodiment, such may be provided by an intercalating dye which 
intercalates only into double stranded segments, e.g., where
interaction occurs. See, e.g., Sheldon et al. U.S. Pat. No. 
4,582,789. 

In many uses, the target sequence will be absolutely 
homogeneous, both with respect to the total sequence and with 
respect to the ends of each molecule. Homogeneity with respect 
to sequence is important to avoid ambiguity. It is preferable 
that the target sequences of interest not be contaminated with 
a significant amount of labeled contaminating sequences. The 
extent of allowable contamination will depend on the 
sensitivity of the detection system and the inherent signal to 
noise of the system. Homogeneous contamination sequences will 
be particularly disruptive of the sequencing procedure.

However, although the target polynucleotide must have
a unique sequence, the target molecules need not have identical
ends. In fact, the homogeneous target molecule preparation may
be randomly sheared to increase the number of
molecules. Since the total information content remains the
same, the shearing results only in a higher number of distinct
sequences which may be labeled and bind to the probe. This
fragmentation may give a vastly superior signal relative to a
preparation of the target molecules having homogeneous ends.
The signal for the hybridization is likely to be dependent on
the numerical frequency of the target-probe interactions. If a
sequence is individually found on a larger number of separate
molecules a better signal will result. In fact, shearing a
homogeneous preparation of the target may often be preferred
before the labeling procedure is performed, thereby producing a
large number of labeling groups associated with each
subsequence.

C. Hybridization Conditions

The hybridization conditions between probe and target
should be selected such that the specific recognition
interaction, i.e., hybridization, of the two molecules is both
sufficiently specific and sufficiently stable. See, e.g.,
Hames and Higgins (1985) Nucleic Acid Hybridisation: A
Practical Approach, IRL Press, Oxford. These conditions will
be dependent both on the specific sequence and often on the
guanine and cytosine (GC) content of the complementary hybrid
strands. The conditions may often be selected to be
universally equally stable independent of the specific
sequences involved. This typically will make use of a reagent
such as an arylammonium buffer. See, Wood et al. (1985) "Base
Composition-independent Hybridization in Tetramethylammonium
Chloride: A Method for Oligonucleotide Screening of Highly
Complex Gene Libraries," Proc. Natl. Acad. Sci. USA, 82:1585-
1588; and Krupov et al. (1989) "An Oligonucleotide
Hybridization Approach to DNA Sequencing," FEBS Letters,
256:118-122; each of which is hereby incorporated herein by
reference. An arylammonium buffer tends to minimize
differences in hybridization rate and stability due to GC
content. By virtue of the fact that sequences then hybridize
with approximately equal affinity and stability, there is
relatively little bias in strength or kinetics of binding for
particular sequences. Temperature and salt conditions along
with other buffer parameters should be selected such that the
kinetics of renaturation are essentially independent of
the specific target subsequence or oligonucleotide probe
involved. In order to ensure this, the hybridization reactions
will usually be performed in a single incubation of all the
substrate matrices together exposed to the identical
target probe solution under the same conditions.

Alternatively, various substrates may be individually 
treated differently. Different substrates may be produced, 
each having reagents which bind to target subsequences with 
substantially identical stabilities and kinetics of 
hybridization. For example, all of the high GC content probes 
could be synthesized on a single substrate which is treated 
accordingly. In this embodiment, the arylammonium buffers 
could be unnecessary. Each substrate is then treated in a 
manner that the collection of substrates shows essentially
uniform binding and the hybridization data of target binding to
the individual substrate matrix is combined with the data from
other substrates to derive the necessary subsequence binding 
information. The hybridization conditions will usually be 
selected to be sufficiently specific that the fidelity of base 
matching will be properly discriminated. Of course, control 
hybridizations should be included to determine the stringency 
and kinetics of hybridization. 
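
Grouping probes by GC content, as in the alternative embodiment above, can be sketched as follows. This is a minimal illustration only: the 0.5 threshold and the function names are hypothetical, and a practical scheme would likely use more than two groups and a thermodynamic stability model instead of a raw base count.

```python
def gc_fraction(probe):
    """Fraction of G and C bases in a probe sequence."""
    probe = probe.upper()
    return sum(base in "GC" for base in probe) / len(probe)


def split_by_gc(probes, threshold=0.5):
    """Partition probes into (high GC, low GC) lists.

    The threshold is an illustrative assumption, not a value from the text.
    """
    high = [p for p in probes if gc_fraction(p) > threshold]
    low = [p for p in probes if gc_fraction(p) <= threshold]
    return high, low


high_gc, low_gc = split_by_gc(["GGCCGGCC", "ATATATAT", "ATGCATGC"])
print("high GC substrate:", high_gc)
print("low GC substrate:", low_gc)
```

Each group could then be placed on its own substrate and hybridized under conditions suited to that group.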

D. Detection; VLSIPS Scanning 
The next step of the sequencing process by 
hybridization involves labeling of target polynucleotide 
molecules. A quickly and easily detectable signal is
preferred. The VLSIPS apparatus is designed to easily detect a
fluorescent label, so fluorescent tagging of the target
sequence is preferred. Other suitable labels include heavy
metal labels, magnetic probes, chromogenic labels (e.g.,
phosphorescent labels, dyes, and fluorophores), spectroscopic
labels, enzyme linked labels, radioactive labels, and labeled 
binding proteins. Additional labels are described in U.S. Pat. 
No. 4,366,241, which is incorporated herein by reference. 

The detection methods used to determine where
hybridization has taken place will typically depend upon the
label selected above. Thus, for a fluorescent label a
fluorescent detection step will typically be used. U.S. S.N.
07/492,462 (VLSIPS CIP) and U.S. S.N. / , , attorney
docket number 11509-28 (automated VLSIPS) describe apparatus
and mechanisms for scanning a substrate matrix using 
fluorescence detection, but a similar apparatus is adaptable 
for other optically detectable labels. 

The detection method provides a positional 
localization of the region where hybridization has taken place. 
However, the position is correlated with the specific sequence 
of the probe since the probe has specifically been attached or 
synthesized at a defined substrate matrix position. Having 
collected all of the data indicating the subsequences present 
in the target sequence, this data may be aligned by overlap to 
reconstruct the entire sequence of the target, as illustrated 
above. 

It is also possible to dispense with actual labeling 
if some means for detecting the positions of interaction 
between the sequence specific reagent and the target molecule 
is available. This may take the form of an additional reagent
which can indicate the sites either of interaction, or the 
sites of lack of interaction, e.g., a negative label. For the 
nucleic acid embodiments, locations of double strand 
interaction may be detected by the incorporation of 
intercalating dyes, or other reagents such as antibody or other 
reagents that recognize helix formation, see, e.g., Sheldon, et 
al. (1986) U.S. Pat. No. 4,582,789, which is hereby 
incorporated herein by reference. 

E. Analysis 

Although the reconstruction can be performed manually 
as illustrated above, a computer program will typically be used
to perform the overlap analysis. A program may be written and
run on any of a large number of different computer hardware
systems. The variety of operating systems and languages
useable will be recognized by a computer software engineer.
Various different languages may be used, e.g., BASIC; C;
PASCAL; etc. A simple flow chart of data analysis is
illustrated in Figure 4.
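
As an illustration of what such an overlap analysis might look like, the following is a minimal sketch and not the flow chart of Figure 4. It assumes an idealized data set: every subsequence of length k (with k of at least 2) in the target is detected exactly once, there are no false positives, and each overlap of k-1 bases is unambiguous. The function name is hypothetical.

```python
def reconstruct(subsequences):
    """Rebuild a target from its k-base subsequences by (k-1)-base overlap.

    Raises ValueError when the start or any extension is ambiguous.
    """
    kmers = set(subsequences)
    if not kmers:
        raise ValueError("no subsequences supplied")
    k = len(next(iter(kmers)))

    # Index subsequences by their first k-1 bases.
    by_prefix = {}
    for kmer in kmers:
        by_prefix.setdefault(kmer[:-1], []).append(kmer)

    # The starting subsequence is the one no other subsequence leads into.
    suffixes = {kmer[1:] for kmer in kmers}
    starts = [kmer for kmer in kmers if kmer[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting subsequence")

    sequence = starts[0]
    used = {starts[0]}
    while len(used) < len(kmers):
        candidates = by_prefix.get(sequence[-(k - 1):], [])
        if len(candidates) != 1 or candidates[0] in used:
            raise ValueError("ambiguous or broken overlap")
        sequence += candidates[0][-1]  # extend by the one new base
        used.add(candidates[0])
    return sequence


target = "ATGCAGGTC"
observed = {target[i:i + 4] for i in range(len(target) - 3)}
print(reconstruct(observed))
```

Real hybridization data would contain repeats, missing signals, and mismatched binding, so a practical program would need to handle ambiguity instead of raising an error.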

F. Substrate Reuse

Finally, after a particular sequence has been 
hybridized and the pattern of hybridization analyzed, the 
matrix substrate should be reusable and readily prepared for 
exposure to a second or subsequent target polynucleotide. In
order to do so, the hybrid duplexes are disrupted and the 
matrix treated in a way which removes all traces of the 
original target. The matrix may be treated with various 
detergents or solvents to which the substrate, the 
oligonucleotide probes, and the linkages to the substrate are 
inert. This treatment may include an elevated temperature 
treatment, treatment with organic or inorganic solvents, 
modifications in pH, and other means for disrupting specific 
interaction. Thereafter, a second target may actually be 
applied to the recycled matrix and analyzed as before. 

G. Non-Polynucleotide Aspects

Although the sequencing, fingerprinting, and mapping 
functions will make use of the natural sequence recognition 
property of complementary nucleotide sequences, the non- 
polynucleotide sequences typically require other sequence 
recognition reagents. These reagents will take the form, 
typically, of proteins exhibiting binding specificity, e.g.,
enzyme binding sites or antibody binding sites. 

Enzyme binding sites may be derived from promoter 
proteins, restriction enzymes, and the like. See, e.g., 
Stryer, L. (1988) Biochemistry, W.H. Freeman, Palo Alto.
Antibodies will typically be produced using standard
procedures, see, e.g., Harlow and Lane (1988) Antibodies: A
Laboratory Manual, Cold Spring Harbor Press, New York; and
Goding (1986) Monoclonal Antibodies: Principles and Practice,
(2d Ed.) Academic Press, San Diego.

Typically, an antigen, or collection of antigens, is
presented to an immune system. This may take the form of
synthesized short polymers produced by the VLSIPS technology,
or by the other synthetic means, or from isolation of natural
products. For example, antigen for the polypeptides may be
made by the VLSIPS technology, by standard peptide synthesis,
by isolation of natural proteins with or without degradation to
shorter segments, or by expression of a collection of short
nucleic acids of random or defined sequences. See, e.g., Tuerk
and Gold (1990) Science 249:505-510, for generation of a
collection of randomly mutagenized oligonucleotides useful for
expression.

The antigen or collection is presented to an 
appropriate immune system, e.g., to a whole animal as in a 
standard immunization protocol, or to a collection of immune 
cells or equivalent. In particular, see Ward et al. (1989)
Nature 341:544-546; and Huse et al. (1989) Science 246:1275- 
1281, each of which is hereby incorporated herein by reference. 

A large diversity of antibodies will be generated, 
some of which have specificities for the desired sequences. 
Antibodies may be purified having the desired sequence 
specificities by isolating the cells producing them. For 
example, a VLSIPS substrate with the desired antigens 
synthesized thereon may be used to isolate cells with cell 
surface reagents which recognize the antigens. The VLSIPS
substrate may be used as an affinity reagent to select and 
recover the appropriate cells. Antibodies from those cells may 
be attached to a substrate using the caged biotin methodology, 
or by attaching a targeting molecule, e.g., an oligonucleotide. 
Alternatively, the supernatants from antibody producing cells 
can be easily assayed using a VLSIPS substrate to identify the 
cells producing the appropriate antibodies. 

Although cells may be isolated, specific antibody 
molecules which perform the sequence recognition will also be 
sufficient. Preferably populations of antibody with a known 
specificity can be isolated. Supernatants from a large
population of producing cells may be passed over a VLSIPS
substrate to bind to the desired antigens attached to the
substrate. When a sufficient density of antibody molecules is
attached, they may be removed by an automated process,
preferably as antibody populations exhibiting specificity of
binding.

In one particular embodiment, a VLSIPS substrate, 
e.g., with a large plurality of fingerprint antigens attached 
thereto, is used to isolate antibodies from a supernatant of a 
population of cells producing antibodies to the antigens.
Using the substrate as an affinity reagent, the antibodies will 
attach to the appropriate positionally defined antigens. The 
antibodies may be carefully removed therefrom, preferably by an 
automated system which retains their homogeneous specificities. 
The isolated antibodies can be attached to a new substrate in a 
positionally defined matrix pattern. 

In a further embodiment, these spatially separated 
antibodies may be isolated using a specific targeting method 
for isolation. In this embodiment, a linker molecule which 
attaches to a particular portion of the antibody, preferably 
away from the binding site, can be attached to the antibodies. 
Various reagents will be used, including staphylococcus protein 
A or antibodies which bind to domains remote from the binding 
site. Alternatively, the antibodies in the population, before 
affinity purification, may be derivatized with an appropriate 
reagent compatible with new VLSIPS synthesis. A preferred 
reagent is a nucleotide which can serve as a linker to 
synthetic VLSIPS steps for synthesizing a specific sequence 
thereon. Then, by successive VLSIPS cycles, each of the 
antibodies attached to the defined antigen regions can have a 
defined oligonucleotide synthesized thereon and corresponding 
in area to the region of the substrate having each antigen 
attached. These defined oligonucleotides will be useful as 
targeting reagents to attach those antibodies possessing the 
same target sequence specificity at defined positions on a new 
substrate, by virtue of having bound to the antigen region, to 
a new VLSIPS substrate having the complementary target 
oligonucleotides positionally located on it. In this fashion,
a VLSIPS substrate having the desired antigens attached thereto
can be used to generate a second VLSIPS substrate with
positionally defined reagents which recognize those antigens. 

The selected antigens will typically be selected to 
be those which define particular functionalities or properties, 
so as to be useful for fingerprinting and other uses. They 
will also be useful for mapping and sequencing embodiments. 

IV. FINGERPRINTING

A. General

Many of the procedures and techniques used in the
polynucleotide sequencing section are also appropriate for
fingerprinting applications. See, e.g., Poustka, et al. (1986)
Cold Spring Harbor Symposia on Quant. Biol., vol. LI, 131-139,
Cold Spring Harbor Press, New York; which is hereby 
incorporated herein by reference. The fingerprinting method 
provided herein is based, in part, upon the ability to 
positionally localize a large number of different specific 
probes onto a single substrate. This high density matrix 
pattern provides the ability to screen for, or detect, a very 
large number of different sequences simultaneously. In fact, 
depending upon the hybridization conditions, fingerprinting to 
the resolution of virtually absolute matching of sequence is 
possible thereby approaching an absolute sequencing embodiment. 
And the sequencing embodiment is very useful in identifying the 
probes useful in further fingerprinting uses. For example, 
characteristic features of genetic sequences will be identified 
as being diagnostic of the entire sequence. However, in most 
embodiments, longer probes and targets will be used, for
which slight mismatching may not need to be resolved.

B. Preparation of Substrate Matrix
A collection of specific probes may be produced by 
either of the methods described above in the section on 
sequencing. Specific oligonucleotide probes of desired lengths 
may be individually synthesized on a standard oligonucleotide 
synthesizer. The length of these probes is limited only by the
ability of the synthesizer to continue to
accurately synthesize a molecule. Oligonucleotides or sequence
fragments may also be isolated from natural sources. 
Biological amplification methods may be coupled with synthetic 
synthesizing procedures such as, e.g., polymerase chain 
reaction. 

In one embodiment, the individually isolated probes
may be attached to the matrix at defined positions. These
probe reagents may be attached by an automated process making
use of the caged biotin methodology described in U.S. S.N.
07/612,671 (caged biotin CIP), or using photochemical reagents,
see, e.g., Dattagupta et al. (1985) U.S. Pat. No. 4,542,102 and
(1987) U.S. Pat. No. 4,713,326. Each individual purified
reagent can be attached individually at specific locations on a 
substrate. 

In another embodiment, the VLSIPS synthesizing 
technique may be used to synthesize the desired probes at 
specific positions on a substrate. The probes may be 
synthesized by successively adding appropriate monomer 
subunits, e.g., nucleotides, to generate the desired sequences. 

In another embodiment, a relatively short specific 
oligonucleotide is used which serves as a targeting reagent for 
positionally directing the sequence recognition reagent. For 
example, the sequence specific reagents having a separate 
additional sequence recognition segment (usually of a different
polymer from the target sequence) can be directed to target 
oligonucleotides attached to the substrate. By use of non- 
natural targeting reagents, e.g., unusual nucleotide analogues 
which pair with other unnatural nucleotide analogues and which 
do not interfere with natural nucleotide interactions, the 
natural and non-natural portions can coexist on the same 
molecule without interfering with their individual 
functionalities. This can combine both a synthetic and 
biological production system analogous to the technique for 
targeting monoclonal antibodies to locations on a VLSIPS 
substrate at defined positions. Unnatural optical isomers of 
nucleotides may be useful unnatural reagents subject to similar 
chemistry, but incapable of interfering with the natural 
biological polymers. See also, U.S. S.N. / , , attorney
docket number 11509-26 (sequencing by synthesis), which is
hereby incorporated herein by reference.

After the separate substrate attached reagents are 
attached to the targeting segment, the two are crosslinked,
thereby permanently attaching them to the substrate. Suitable
crosslinking reagents are known, see, e.g., Dattagupta et al.
(1985) U.S. Pat. No. 4,542,102 and (1987) "Coupling of nucleic
acids to solid support by photochemical methods," U.S. Pat. No.
4,713,326, each of which is hereby incorporated herein by
reference. Similar linkages for attachment of proteins to a
solid substrate are provided, e.g., in Merrifield (1986)
Science 232:341- , which is hereby incorporated herein by
reference.

C. Labeling Target Nucleotides 
The labeling procedures used in the sequencing 
embodiments will also be applicable in the fingerprinting 
embodiments. However, since the fingerprinting embodiments
often will involve relatively large target molecules and
relatively short oligonucleotide probes, the amount of signal
necessary to incorporate into the target sequence may be less 
critical than in the sequencing applications. For example, a 
relatively long target with a relatively small number of labels 
per molecule may be easily amplified or detected because of the 
relatively large target molecule size. 

In various embodiments, it may be desired to cleave 
the target into smaller segments as in the sequencing 
embodiments. The labeling procedures and cleavage techniques 
described in the sequencing embodiments would usually also be 
applicable here. 

D. Hybridization Conditions 

The hybridization conditions used in fingerprinting 
embodiments will typically be less critical than for the 
sequencing embodiments. The reason is that the amount of 
mismatching which may be useful in providing the fingerprinting 
information would typically be far greater than that necessary 
in sequencing uses. For example, Southern hybridizations do
not typically distinguish between slightly mismatched
sequences. Under these circumstances, important and valuable 
information may be arrived at with less stringent hybridization 
conditions while providing valuable fingerprinting information. 
However, since the entire substrate is typically exposed to the 
target molecule at one time, the binding affinity of the probes 
should usually be of approximately comparable levels. For this
reason, if oligonucleotide probes are being used, their lengths 
should be approximately comparable and will be selected to 
hybridize under conditions which are common for most of the 
probes on the substrate. Much as in a Southern hybridization, 
the target and oligonucleotide probes are of lengths typically
greater than about 25 nucleotides. Under appropriate 
hybridization conditions, e.g., typically higher salt and lower 
temperature, the probes will hybridize irrespective of 
imperfect complementarity. In fact, with probes of greater
than, e.g., about fifty nucleotides, the difference in 
stability of different sized probes will be relatively minor. 

Typically the fingerprinting is merely for probing 
similarity or homology. Thus, the stringency of hybridization 
can usually be decreased to fairly low levels. See, e.g., 
Wetmur and Davidson (1968) "Kinetics of Renaturation of DNA,"
J. Mol. Biol. 31:349-370; and Kanehisa, M. (1984) Nuc. Acids
Res. 12:203-213.

E. Detection; VLSIPS Scanning
Detection methods will be selected which are 
appropriate for the selected label. The scanning device need 
not necessarily be digitized or placed into a specific digital 
database, though such would most likely be done. For example, 
the analysis in fingerprinting could be photographic. Where a 
standardized fingerprint substrate matrix is used, the pattern 
of hybridizations may be spatially unique and may be compared 
photographically. In this manner, each sample may have a 
characteristic pattern of interactions and the likelihood of 
identical patterns will preferably be of such low frequency that
the fingerprint pattern indeed becomes a characteristic pattern 
virtually as unique as an individual's fingertip fingerprint.
With a standardized substrate, every individual could be, in
theory, uniquely identifiable on the basis of the pattern of 
hybridizing to the substrate. 

Of course, the VLSIPS scanning apparatus may also be 
useful to generate a digitized version of the fingerprint 
pattern. In this way, the identification pattern can be
provided in a linear string of digits. This sequence could
also be used for a standardized identification system providing 
significant useful medical transferability of specific data. 
In one embodiment, where the probes used are selected to be of
sufficiently high resolution to measure the antigens of the
major histocompatibility complex, it might even be possible to
provide transplantation matching data in a linear stream of
data. The fingerprinting data may provide a condensed version,
or summary, of the linear genetic data, or any other
information data base. 
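
The digitized fingerprint described above can be sketched in a few lines. This is only an illustrative sketch, assuming the scanner output is a rectangular grid of signal intensities and that a single hypothetical cutoff (`threshold`) separates hybridized from non-hybridized positions; the function names and the sample values are assumptions, not part of the described apparatus.

```python
from typing import Sequence


def digitize_fingerprint(intensities: Sequence[Sequence[float]],
                         threshold: float) -> str:
    """Flatten a grid of signal intensities, row by row, into a string
    of digits: '1' where the signal meets the threshold, '0' elsewhere."""
    return "".join(
        "1" if value >= threshold else "0"
        for row in intensities
        for value in row
    )


def count_differences(a: str, b: str) -> int:
    """Count the positions at which two digitized fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must come from the same substrate layout")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 2 x 3 scan of a standardized substrate.
scan = [[0.9, 0.1, 0.7],
        [0.0, 0.8, 0.2]]
code = digitize_fingerprint(scan, threshold=0.5)
print(code)  # 101010
```

Two samples scanned on the same standardized layout can then be compared position by position with `count_differences`.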

F. Analysis 

The analysis of the fingerprint will often be much 
simpler than a total sequence determination. However, there 
may be particular types of analysis which will be substantially 
simplified by a selected group of probes. For example, probes 
which exhibit particular populational heterogeneity may be 
selected. In this way, analysis may be simplified and 
practical utility enhanced merely by careful selection of the 
specific probes and a careful matrix layout of those probes. 

G. Substrate Reuse 

As with the sequencing application, the
fingerprinting usages may also take advantage of the 
reusability of the substrate. In this way, the interactions 
can be disrupted, the substrate treated, and the renewed 
substrate is equivalent to an unused substrate. 

H. Non-polynucleotide Aspects

Besides polynucleotide applications, the
fingerprinting analysis may be applied to other polymers,
especially polypeptides, carbohydrates, and other polymers,
both organic and inorganic. Besides using the fingerprinting
method for analyzing a particular polymer, the fingerprinting
method may be used to characterize various samples. For
example, a cell or population of cells may be tested for their
expression of specific antigens or their mRNA sequence content.
For example, a T-cell may be classified by virtue of its
combination of expressed surface antigens. With specific
reagents which interact with these antigens, a cell or a
population of cells or a lysed cell may be exposed to a VLSIPS
substrate. The biological sample may be classified or
characterized by analyzing the pattern of specific interaction.
This may be applicable to a cell or tissue type, to the
messenger RNA population expressed by a cell, to the
genetic content of a cell, or to virtually any sample which can 
be classified and/or identified by its combination of specific 
molecular properties. 

The ability to generate a high density means for 
screening the presence or absence of specific interactions 
allows for the possibility of screening for, if not saturating, 
all of a very large number of possible interactions. This is 
very powerful in providing the means for testing the 
combinations of molecular properties which can define a class 
of samples. For example, a species of organism may be 
characterized by its DNA sequences, e.g., a genetic 
fingerprint. By using a fingerprinting method, it may be 
determined that all members of that species are sufficiently 
similar in specific sequences that they can be easily 
identified as being within a particular group. Thus, newly
defined classes may be resolved by their similarity in 
fingerprint patterns. Alternatively, a non-member of that 
group will fail to share those many identifying
characteristics. However, since the technology allows testing
of a very large number of specific interactions, it also 
provides the ability to more finely distinguish between closely 
related different cells or samples. This will have important 
applications in diagnosing viral, bacterial, and other 
pathological or nonpathological infections.

In particular, cell classification may be defined by
any of a number of different properties. For example, a cell 
class may be defined by the DNA sequences contained therein.
This allows species identification for parasitic or other
infections. For example, the human cell is presumably
genetically distinguishable from a monkey cell, but different
human cells will share many genetic markers. At higher
resolution, each individual human genome will exhibit unique
sequences that can define it as a single individual. 

Likewise, a developmental stage of a cell type may be 
definable by its pattern of expression of messenger RNA. For
example, in particular stages of cells, high levels of
ribosomal RNA are found whereas relatively low levels of other
types of messenger RNAs may be found. The high resolution
distinguishability provided by this fingerprinting method
allows the distinction between cells which have relatively
minor differences in their expressed mRNA population. Where a
pattern is shown to be characteristic of a stage, a stage may 
be defined by that particular pattern of messenger RNA 
expression. 

In a similar manner, the antigenic determinants found 
on a protein may very well define the cell class. For example, 
immunological T-cells are distinguishable from B-cells because, 
in part, the cell surface antigens on the cell types are 
distinguishable. Different T-cell subclasses can be also 
distinguished from one another by whether they contain 
particular T-cell antigens. The present invention provides the 
possibility for high resolution testing of many different 
interactions simultaneously, and the definition of new cell 
types will be possible. 

The high resolution VLSIPS substrate may also be used 
as a very powerful diagnostic tool to test the combination of 
presence of a plurality of different assays from a biological
sample. For example, a cancerous condition may be indicated by 
a combination of various different properties found in the 
blood. For example, a cancerous condition may be indicated by 
a combination of expression of various soluble antigens found 
in the blood along with a high number of various cellular
antigens found on lymphocytes and/or particular cell
degradation products. With a substrate as provided herein, a
large number of different tests can be simultaneously
performed on a biological sample. In fact, the high resolution
of the test will allow more complete characterization of 
parameters which define particular diseases. Thus, the power 
of diagnostic tests may be limited by the extent of statistical 
correlation with a particular condition rather than with the 
number of antigens or interactions which are tested. The 
present invention provides the means to generate this large 
universe of possible reagents and the ability to actually 
accumulate that correlative data. 

In another embodiment, a substrate as provided herein 
may be used for genetic screening. This would allow for 
simultaneous screening of thousands of genetic markers. As the 
density of the matrix is increased, many more molecules can be 
simultaneously tested. Genetic screening then becomes a 
simpler method as the present invention provides the ability to 
screen for thousands, tens of thousands, and hundreds of 
thousands, even millions of different possible genetic 
features. However, the number of high correlation genetic 
markers for conditions numbers only in the hundreds. Again, 
the possibility for screening a large number of sequences 
provides the opportunity for generating the data which can 
provide correlation between sequences and specific conditions 
or susceptibility. The present invention provides the means to 
generate extremely valuable correlations useful for the genetic 
detection of the causative mutation leading to medical 
conditions. In still another embodiment, the present invention 
would be applicable to distinguishing two individuals having 
identical genetic compositions. The antibody population within 
an individual is dependent both on genetic and historical 
factors. Each individual experiences a unique exposure to 
various infectious agents, and the combined antibody expression 
is partly determined thereby. Thus, individuals may also be 
fingerprinted by their immunological content, either of 
actively expressed antibodies, or their immunological memory. 
Similar sorts of immunological and environmental histories may
be useful for fingerprinting, perhaps in combination with other
screening properties. In particular, the present invention may
be useful for screening allergic reactions or susceptibilities;
a simple IgE specificity test may be useful in determining a
spectrum of allergies.

With the definition of new classes of cells, a cell 
sorter will be used to purify them. Moreover, new markers for 
defining that class of cells will be identified. For example, 
where the class is defined by its RNA content, cells may be 
screened by antisense probes which detect the presence or 
absence of specific sequences therein. Alternatively, cell
lysates may provide information useful in correlating 
intracellular properties with extracellular markers which 
indicate functional differences. Using standard cell sorter 
technology with a fluorescence or labeled antisense probe which 
recognizes the internal presence of the specific sequences of 
interest, the cell sorter will be able to isolate a relatively 
homogeneous population of cells possessing the particular 
marker. Using successive probes the sorting process should be 
able to select for cells having a combination of a large number 
of different markers. 

In a non-polynucleotide embodiment, cells may be 
defined by the presence of other markers. The markers may be 
carbohydrates, proteins, or other molecules. Thus, a substrate 
having particular specific reagents, e.g., antibodies, attached 
to it should be able to identify cells having particular 
patterns of marker expression. Of course, combinations of 
these may be utilized and a cell class may be defined by a
combination of its expressed mRNA, its carbohydrate expression, 
its antigens, and other properties. This fingerprinting should 
be useful in determining the physiological state of a cell or 
population of cells. 

Having defined a cell type whose function or 
properties are defined by the reagents attachable to a VLSIPS 
substrate, such as cellular antigens, these structural 
manifestations of function may be used to sort cells to 
generate a relatively homogeneous population of that class of 
cells. Standard cell sorter technology may be applied to
purify such a population, see, e.g., Dangl, J. and Herzenberg
(1982) "Selection of hybridomas and hybridoma variants using
the fluorescence activated cell sorter," J. Immunological
Methods 52:1-14; and Becton Dickinson, Fluorescence Activated
Cell Sorter Division, San Jose, California, and Coulter
Diagnostics, Hialeah, Florida.

A difficulty with the fingerprinting method as an identification
means arises from mosaicism problems in an organism. A mosaic
organism is one whose genetic content in different cells is
significantly different. Various clonal populations should
have similar genetic fingerprints, though different clonal
populations may have different genetic contents. See, for
example, Suzuki et al., An Introduction to Genetic Analysis (4th
Ed.), Freeman and Co., New York, which is hereby incorporated
herein by reference. However, this problem should be a
relatively rare problem and could be more carefully evaluated
with greater experience using the fingerprinting methods.

The invention will also find use in detecting 
changes, both genetic and antigenic, e.g., in a rapidly 
"evolving" protozoa infection, or similarly changing organism.

V. MAPPING 

A. General 

The use of the present invention for mapping 
parallels its use for fingerprinting and sequencing. Where a
polymer is a linear molecule, the mapping provides the ability 
to locate particular segments along the length of the polymer. 
Branched polymers can be treated as a series of individual 
linear polymers. The mapping provides the ability to locate, 
in a relative sense, the order of various subsequences. This
may be achieved using at least two different approaches. 

The first approach is to take the large sequence and 
fragment it at specific points. The fragments are then ordered 
and attached to a solid substrate. For example, the clones 
resulting from a chromosome walking process may be individually
attached to the substrate by methods, e.g., caged biotin 
techniques, indicated earlier. Segments of unknown map 
position will be exposed to the substrate and will hybridize to
the segment which contains that particular sequence. This
procedure allows the rapid determination of a number of
different labeled segments, each mapping requiring only a
single hybridization step once the substrate is generated. The
substrate may be regenerated by removal of the interaction, and 
the next mapping segment applied. 

In an alternative method, a plurality of subsequences 
can be attached to a substrate. Various short probes may be 
applied to determine which segments may contain particular 
overlaps. The theoretical basis and a description of this 
mapping procedure is contained in, e.g., Evans et al. (1989)
"Physical Mapping of Complex Genomes by Cosmid Multiplex
Analysis," Proc. Natl. Acad. Sci. USA 86:5030-5034, and other
references cited above in the Section labeled "Overall
Description." Using this approach, the details of the mapping
embodiment are very similar to those used in the fingerprinting
embodiment.

B. Preparation of Substrate Matrix 
The substrate may be generated in either of the
methods generally applicable in the sequencing and
fingerprinting embodiments. The substrate may be made either
synthetically, or by attaching otherwise purified probes or
sequences to the matrix. The probes or sequences may be
derived either from synthetic or biological means. As
indicated above, the solid phase substrate synthetic methods
may be utilized to generate a matrix with positionally defined
sequences. In the mapping embodiment, saturation of all
possible subsequences of a preselected length
is far less important than in the sequencing embodiment, but
the probes used may desirably be much longer.
The processes for making a substrate which has longer 
oligonucleotide probes should not be significantly different 
from those described for the sequencing embodiments, but the 
optimization parameters may be modified to comply with the 
mapping needs.

C. Labeling

The labeling methods will be similar to those 
applicable in sequencing and fingerprinting embodiments. 
Again, it may be desirable to fragment the target sequences.

D. Hybridization/Specific Interaction 

The specificity of interaction between the targets
and probes would typically be closer to that used for
fingerprinting embodiments, where homology is more important 
than absolute distinguishability of high fidelity complementary 
hybridization. Usually, the hybridization conditions will be 
such that merely homologous segments will interact and provide 
a positive signal. Much like the fingerprinting embodiment, it 
may be useful to measure the extent of homology by successive 
incubations at higher stringency conditions. Or, a plurality
of different probes, each having various levels of homology, may
be used. Either way, the spectrum of homologies can be
measured. 
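
The successive-incubation approach can be summarized as recording, for each probe, the highest stringency at which a signal is still detected. The following is a minimal sketch under stated assumptions: the stringency levels are hypothetical integers that increase with stringency, a single hypothetical `threshold` defines a positive signal, and the function name and sample data are illustrative only.

```python
from typing import Dict, List, Tuple


def highest_stringency_retained(
    series: List[Tuple[int, Dict[str, float]]],
    threshold: float,
) -> Dict[str, int]:
    """For each probe, return the highest stringency level at which its
    signal still meets the threshold. Probes that never reach the
    threshold are omitted from the result."""
    retained: Dict[str, int] = {}
    # Visit incubations from lowest to highest stringency so that later
    # (more stringent) positives overwrite earlier ones.
    for level, signals in sorted(series, key=lambda item: item[0]):
        for probe, signal in signals.items():
            if signal >= threshold:
                retained[probe] = level
    return retained


# Hypothetical signals from three incubations of increasing stringency.
series = [
    (1, {"probe_a": 0.9, "probe_b": 0.8, "probe_c": 0.6}),
    (2, {"probe_a": 0.9, "probe_b": 0.7, "probe_c": 0.1}),
    (3, {"probe_a": 0.8, "probe_b": 0.2, "probe_c": 0.0}),
]
spectrum = highest_stringency_retained(series, threshold=0.5)
print(spectrum)  # {'probe_a': 3, 'probe_b': 2, 'probe_c': 1}
```

In this sketch a probe that survives to a higher level is read as having greater homology with the target; a real analysis would also need to account for noise and for differences in probe length.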

Where non-nucleic acid hybridization is involved, the 
specific interactions may also be compared in a fingerprint-
like manner. The specific reagents may have less specificity,
e.g., monoclonal antibodies which recognize a broader spectrum
of sequences may be utilized relative to a sequencing 
embodiment. Again, the specificity of interaction may be 
measured under various conditions of increasing stringency to 
determine the spectrum of matching across the specific probes 
selected, or a number of different stringency reagents may be 
included to indicate the binding affinity. 

E. Detection 

The detection methods used in the mapping procedure 
will be virtually identical to those used in the fingerprinting 
embodiment. The detection methods will be selected in 
combination with the labeling methods. 

F. Analysis

The analysis of the data in a mapping embodiment will 
typically be somewhat different from that in fingerprinting.
The fingerprinting embodiment will test for the presence or
absence of specific or homologous segments. However, in the
mapping embodiment, the existence of an interaction is coupled
with some indication of the location of the interaction. The
interaction is mapped in some manner to the physical polymer
sequence. Some means for determining the relative positions of
different probes is provided. This may be achieved by
synthesis of the substrate in a pattern, or may result from
analysis of sequences after they have been attached to the 
substrate. 

For example, the probes may be randomly positioned at
various locations on the substrate. However, the relative
positions of the various reagents in the original polymer may
be determined by using short fragments, e.g., individually, as
target molecules which determine the proximity of different
probes. By an automated system of testing each different short
fragment of the original polymer, coupled with proper analysis,
it will be possible to determine which probes are adjacent to one
another on the original target sequence and correlate that with
positions on the matrix. In this way, the matrix is useful for
determining the relative locations of various new segments in
the original target molecule. This sort of analysis is
described in Evans et al., and the related references described above.
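
The proximity analysis described above can be illustrated with a small sketch. It assumes, for illustration only, that each short fragment has been tested individually, that the set of probes giving a signal with each fragment is known, and that the probes lie along a single unbranched polymer; the function names and the probe and fragment labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def probe_proximity(fragment_hits: Dict[str, Iterable[str]]) -> Dict[Tuple[str, str], int]:
    """Count how many short fragments hybridize to both probes of each
    pair. Probes sharing a fragment are inferred to be close together."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for probes in fragment_hits.values():
        for a, b in combinations(sorted(set(probes)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def order_chain(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Arrange probes into a linear order from adjacent pairs.
    Assumes the pairs form one simple, unbranched chain."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)
    if any(len(n) > 2 for n in neighbors.values()):
        raise ValueError("a probe has more than two neighbors; not a simple chain")
    ends = sorted(p for p, n in neighbors.items() if len(n) == 1)
    if len(ends) != 2:
        raise ValueError("pairs do not form a single linear chain")
    order, previous = [ends[0]], None
    while True:
        candidates = [n for n in neighbors[order[-1]] if n != previous]
        if not candidates:
            break
        previous = order[-1]
        order.append(candidates[0])
    return order


# Hypothetical results: each fragment lights up two randomly placed probes.
hits = {"frag1": ["P3", "P7"], "frag2": ["P7", "P1"], "frag3": ["P1", "P4"]}
print(order_chain(probe_proximity(hits)))  # ['P3', 'P7', 'P1', 'P4']
```

The resulting order is relative only; it may equally be read in reverse, and it can then be correlated with the known positions of the probes on the matrix.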

G. Substrate Reuse

The substrate should be reusable in the manner
described in the fingerprinting section. The substrate is
renewed by removal of the specific interactions and is washed
and prepared for successive cycles of exposure to new target
sequences.

H. Non-polynucleotide Aspects

The mapping procedure may be used on molecules other
than polynucleotides. Although hybridization is one type of
specific interaction which is clearly useful in this
mapping embodiment, antibody reagents may also be very useful.
In the same way that polypeptides or other polymers
may be sequenced by the reagents and techniques described in
the sequencing section and fingerprinting section, the mapping
embodiment may also be used similarly.

In another form of mapping, as described above in the 
fingerprinting section, the developmental map of a cell or
biological system may be measured using fingerprinting type
technology. Thus, the mapping may be along a temporal
dimension rather than along a polymer dimension. The mapping
or fingerprinting embodiments may also be used in determining
the genetic rearrangements which may be genetically important,
as in lymphocyte and B-cell development. In another example,
various rearrangements or chromosomal dislocations may be
tested by either the fingerprinting or mapping methods. These 
techniques are similar in many respects and the fingerprinting 
and mapping embodiments may overlap in many respects. 

VI. ADDITIONAL SCREENING AND APPLICATIONS 

A. Specific Interactions 

As originally indicated in the parent filing of 
VLSIPS, the production of a high density plurality of spatially 
segregated polymers provides the ability to generate a very 
large universe or repertoire of individual and distinct
sequence possibilities. As indicated above, particular 
oligonucleotides may be synthesized in automated fashion at 
specific locations on a matrix. In fact, these 
oligonucleotides may be used to direct other molecules to 
specific locations by linking specific oligonucleotides to
other reagents which are in batch exposed to the matrix and 
hybridized in a complementary fashion to only those locations 
where the complementary oligonucleotide has been synthesized on 
the matrix. This allows for spatially attaching a plurality of 
different reagents onto the matrix instead of individually 
attaching each separate reagent at each specific location.
Although the caged biotin method allows the automated 
attachment, the speed of the caged biotin attachment process is 
relatively slow and requires a separate reaction for each 
reagent being attached. By use of the oligonucleotide method,
positional specificity can be achieved in an automated and
parallel fashion. As each reagent is produced, instead of
directly attaching each reagent at each desired position, the
reagent may be attached to a specific desired complementary
oligonucleotide which will ultimately be specifically directed 
toward locations on the matrix having a complementary 
oligonucleotide attached thereat. 

In addition, the technology allows screening for 
specificity of interaction with particular reagents. For 
example, the oligonucleotide sequence specificity of binding of 
a potential reagent may be tested by presenting to the reagent 
all of the possible subsequences available for binding.
Although secondary or higher order sequence specific features 
might not be easily screenable using this technology, it does 
provide a convenient, simple, quick, and thorough screen of 
interactions between a reagent and its target recognition 
sequences. See, e.g., Pfeifer et al. (1989) Science
246:810-812.

For example, the interaction of a promoter protein 
with its target binding sequence may be tested for many 
different, or all, possible binding sequences. By testing the 
strength of interactions under various different conditions, 
the interaction of the promoter protein with each of the 
different potential binding sites may be analyzed. The 
spectrum of strength of interactions with each different 
potential binding site may provide significant insight into the 
types of features which are important in determining 
specificity. 
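
As a rough illustration of screening a binding protein against all possible sequences of a given length, the sketch below enumerates every oligonucleotide of that length and ranks sites by measured signal. The four-letter alphabet is standard; the function names, the site length, and the signal values are hypothetical and serve only to show the bookkeeping involved.

```python
from itertools import product
from typing import Dict, List, Tuple


def all_oligos(length: int, alphabet: str = "ACGT") -> List[str]:
    """Enumerate every possible sequence of the given length.
    The count is len(alphabet) ** length, e.g. 4 ** 3 = 64."""
    return ["".join(letters) for letters in product(alphabet, repeat=length)]


def rank_binding_sites(signals: Dict[str, float], top: int = 3) -> List[Tuple[str, float]]:
    """Return the sites with the strongest measured interaction."""
    return sorted(signals.items(), key=lambda item: item[1], reverse=True)[:top]


trimers = all_oligos(3)
print(len(trimers))  # 64

# Hypothetical binding signals for a few of the possible sites.
measured = {"AAA": 0.05, "ACG": 0.91, "CGT": 0.40, "TTT": 0.02}
print(rank_binding_sites(measured, top=2))  # [('ACG', 0.91), ('CGT', 0.4)]
```

Because the number of possible sites grows as a power of the site length, the high density of the matrix is what makes an exhaustive screen of this kind practical.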

An additional example of a sequence specific 
interaction between reagents is the testing of binding of a 
double stranded nucleic acid structure with a single stranded 
oligonucleotide. Often, a triple stranded structure is 
produced which has significant aspects of sequence specificity. 
Testing of such interactions with either sequences comprising 
only natural nucleotides, or perhaps the testing of nucleotide 
analogs may be very important in screening for particularly 
useful diagnostic or therapeutic reagents. See, e.g., Haner
and Dervan (1990) Biochemistry 29:9761-9765, and references
therein.

B. Sequence Comparisons

Once a gene is sequenced, the present invention
provides means to compare alleles or related sequences to
locate and identify differences from the control sequence.
This would be extremely useful in further analysis of genetic
variability at a specific gene locus.

C. Categorizations 

As indicated above in the fingerprinting and mapping 
embodiments, the present invention is also useful to define 
specific stages in the temporal sequence of cells, e.g.,
development, and the resulting tissues within an organism. For
example, the developmental stage of a cell, or population of
cells, can be dependent upon the expression of particular
messenger RNAs or cellular antigens. The screening procedures
provided allow for high resolution definition of new classes of
cells. In addition, the temporal development of particular
cells will be characterized by the presence or expression of
various mRNAs. Means to simultaneously screen a plurality or
very large number of different sequences are provided. The
combination of different markers made available dramatically 
increases the ability to distinguish fairly closely related 
cell types. Other markers may be combined with markers and 
methods made available herein to define new classifications of 
biological samples, e.g., based upon new combinations of 
markers.

The presence or absence of particular marker 
sequences will be used to define temporal developmental stages. 
Once the stages are defined, fairly simple methods can be 
applied to actually purify those particular cells. For 
example, antisense probes or recognition reagents may be used 
with a cell sorter to select those cells containing or 
expressing the critical markers. Alternatively, the expression 
of those sequences may result in specific antigens which may 
also be used in defining cell classes and sorting those cells 
away from others. In this way, for example, it should be 
possible to select a class of omnipotent immune system cells 
which are able to completely regenerate a human immune system.
Based upon the cellular classes defined by the parameters made
available by this technology, purified classes of cells having 
identifiable differences, structural or functional, are made 
available. 

In an alternative embodiment, a plurality of antigens
or specific binding proteins attached to the substrate may be 
used to define particular cell types. For example, subclasses 
of T-cells are defined, in part, upon the combination of 
expressed cell surface antigens. The present invention allows 
for the simultaneous screening of a large plurality of 
different antigens together. Thus, higher resolution 
classification of different T-cell subclasses becomes possible 
and, with the definitions and functional differences which 
correlate with those antigenic or other parameters, the ability 
to purify those cell types becomes available. This is
applicable not only to T-cells, lymphocyte cells, or even to
freely circulating cells. Many of the cells for which this
would be most useful will be immobile cells found in particular
tissues or organs. Tumor cells will be diagnosed or detected
using these fingerprinting techniques. Coupled with a temporal 
change in structure, developmental classes may also be selected 
and defined using these technologies. The present invention 
also provides the ability not only to define new classes of 
cells based upon functional or structural differences, but it 
also provides the ability to select or purify populations of 
cells which share these particular properties. Standard cell 
sorting procedures using antibody markers may be used to detect
extracellular features. Intracellular features would also be
amenable to detection by introducing the label reagents into the
cell. In particular, antisense DNA or RNA molecules may be introduced
into a cell to detect RNA sequences therein. See, e.g.,
Weintraub (1990) Scientific American 262:40-46. 

D. Statistical Correlations 

In an additional embodiment, the present invention 
also allows for the high resolution correlation of medical 
conditions with various different markers. For example,
currently available technology, when applied to amniocentesis or other
genetic screening methods, typically screens for tens of
different markers at most. The present invention allows
simultaneous screening for tens, hundreds, thousands, tens of
thousands, hundreds of thousands, and even millions of 
different genetic sequences. Thus, applying the fingerprinting 
methods of the present invention to a sufficiently large 
population allows detailed statistical analysis to be made, 
thereby correlating particular medical conditions with 
particular markers, typically antigenic or genetic. Tumor
specific antigens will be identified using the present 
invention. 
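
The kind of statistical correlation contemplated here can be illustrated with a standard measure of association between two yes/no variables, the phi coefficient of a 2 x 2 contingency table. This is offered only as a sketch of one possible analysis, not as the method of the invention; the function name and the sample data are hypothetical.

```python
import math
from typing import Sequence


def phi_coefficient(marker: Sequence[bool], condition: Sequence[bool]) -> float:
    """Phi coefficient between marker presence and a medical condition
    across a population. Ranges from -1 to 1; 0 means no association."""
    if len(marker) != len(condition):
        raise ValueError("one marker call and one condition call per individual")
    n11 = sum(1 for m, c in zip(marker, condition) if m and c)
    n10 = sum(1 for m, c in zip(marker, condition) if m and not c)
    n01 = sum(1 for m, c in zip(marker, condition) if not m and c)
    n00 = sum(1 for m, c in zip(marker, condition) if not m and not c)
    denominator = math.sqrt(
        (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    )
    if denominator == 0:
        return 0.0  # marker or condition does not vary in this sample
    return (n11 * n00 - n10 * n01) / denominator


# Hypothetical population of four individuals.
has_condition = [True, True, False, False]
print(phi_coefficient([True, True, False, False], has_condition))  # 1.0
print(phi_coefficient([True, False, True, False], has_condition))  # 0.0
```

With thousands of markers screened at once, a measure such as this would be computed for each marker, and the usual corrections for multiple comparisons would be needed before any marker is treated as diagnostic.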

Various medical conditions may be correlated against 
an enormous data base of the sequences within an individual. 
Genetic propensities and correlations then become available and 
high resolution genetic predictability and correlation become 
much more easily performed. With the enormous data base, the 
reliability of the predictions also is better tested. 
Particular markers which are partially diagnostic of particular 
medical conditions or medical susceptibilities will be 
identified and provide direction in further studies and more 
careful analysis of the markers involved. Of course, as 
indicated above in the sequencing embodiment, the present 
invention will find much use in intense sequencing projects. 
For example, sequencing of the entire human genome in the human 
genome project will be greatly simplified and enabled by the 
present invention. 

VII. FORMATION OF SUBSTRATE

The substrate is provided with a pattern of specific 
reagents which are positionally localized on the surface of the 
substrate. This matrix of positions is defined by the 
automated system which produces the substrate. The instrument
will typically be one similar to that described in U.S. S.N.
07/492,462 (VLSIPS CIP), and U.S. S.N. / , , attorney
docket number 11509-28 (automated VLSIPS). The instrumentation
described therein is directly applicable to the applications
used here. In particular, the apparatus comprises a substrate,
typically a silicon containing substrate, on which positions on
the surface may be defined by a coordinate system of positions.
These positions can be individually addressed or detected by
the VLSIPS apparatus.

Typically, the VLSIPS apparatus uses optical methods 
used in semiconductor fabrication applications. In this way, 
masks may be used to photo-activate positions for attachment or 
synthesis of specific sequences on the substrate. These 
manipulations may be automated by the types of apparatus 
described in U.S. S.N. 07/492,462 (VLSIPS CIP) and U.S. S.N.
/ , attorney docket number 11509-28 (automated VLSIPS).

Selectively removable protecting groups allow 
creation of well defined areas of substrate surface having
differing reactivities. Preferably, the protecting groups are 
selectively removed from the surface by applying a specific 
activator, such as electromagnetic radiation of a specific 
wavelength and intensity. More preferably, the specific 
activator exposes selected areas of surface to remove the 
protecting groups in the exposed areas. 
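
The masking cycle described above can be illustrated with a small simulation. This is a minimal sketch, assuming that each cycle exposes a set of sites through a mask and then couples exactly one monomer to every exposed site (that is, the incoming monomer carries its own protecting group, so it adds only once). The function name, the site indices, and the single-letter monomers are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def simulate_masked_synthesis(
    n_sites: int,
    cycles: Iterable[Tuple[Set[int], str]],
) -> List[str]:
    """Build one sequence per site. Each cycle is (mask, monomer): the
    mask lists the sites deprotected by the activator, and the monomer
    is then coupled only at those sites."""
    sequences = [""] * n_sites
    for mask, monomer in cycles:
        for site in mask:
            if not 0 <= site < n_sites:
                raise ValueError(f"site {site} is outside the substrate")
            sequences[site] += monomer
    return sequences


# Four sites, four cycles: two masks per layer yield all four dimers
# formed from {A, C} in the first position and {G, T} in the second.
cycles = [
    ({0, 1}, "A"),
    ({2, 3}, "C"),
    ({0, 2}, "G"),
    ({1, 3}, "T"),
]
print(simulate_masked_synthesis(4, cycles))  # ['AG', 'AT', 'CG', 'CT']
```

The example shows why the approach scales: four coupling cycles produce four distinct, positionally defined sequences, and the number of distinct sequences grows much faster than the number of cycles as more masks are used.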

Protecting groups of the present invention are used 
in conjunction with solid phase oligomer syntheses, such as 
peptide syntheses using natural or unnatural amino acids, 
nucleotide syntheses using deoxyribonucleic and ribonucleic 
acids, oligosaccharide syntheses, and the like. In addition to 
protecting the substrate surface from unwanted reaction, the 
protecting groups block a reactive end of the monomer to 
prevent self-polymerization. For instance, attachment of a 
protecting group to the amino terminus of an activated amino 
acid, such as the N-hydroxysuccinimide-activated ester of the
amino acid, prevents the amino terminus of one monomer from
reacting with the activated ester portion of another during
peptide synthesis.

Alternatively, the protecting group may be attached 
to the carboxyl group of an amino acid to prevent reaction at 
this site. Most protecting groups can be attached to either 
the amino or the carboxyl group of an amino acid, and the 
nature of the chemical synthesis will dictate which reactive 
group will require a protecting group. Analogously, attachment
of a protecting group to the 5'-hydroxyl group of a nucleoside
during synthesis using, for example, phosphate-triester coupling
chemistry, prevents the 5'-hydroxyl of one nucleoside from
reacting with the 3'-activated phosphate-triester of another.

Regardless of the specific use, protecting groups are
employed to protect a moiety on a molecule from reacting with
another reagent. Protecting groups of the present invention
have the following characteristics: they prevent selected 
reagents from modifying the group to which they are attached; 
they are stable (that is, they remain attached) to the 
synthesis reaction conditions; they are removable under 
conditions that do not adversely affect the remaining 
structure; and once removed, do not react appreciably with the 
surface or surface-bound oligomer. The selection of a suitable 
protecting group will depend, of course, on the chemical nature 
of the monomer unit and oligomer, as well as the specific 
reagents they are to protect against. 

In a preferred embodiment, the protecting groups will 
be photoactivatable. The properties and uses of photoreactive 
protecting compounds have been reviewed. See, McCray et al.,
Ann. Rev. of Biophys. and Biophys. Chem. (1989) 18:239-270,
which is incorporated herein by reference. Preferably, the
photosensitive protecting groups will be removable by radiation
in the ultraviolet (UV) or visible portion of the
electromagnetic spectrum. More preferably, the protecting 
groups will be removable by radiation in the near UV or visible 
portion of the spectrum. In some embodiments, however, 
activation may be performed by other methods such as localized 
heating, electron beam lithography, laser pumping, oxidation or 
reduction with microelectrodes, and the like. Sulfonyl
compounds are suitable reactive groups for electron beam 
lithography. Oxidative or reductive removal is accomplished by 
exposure of the protecting group to an electric current source, 
preferably using microelectrodes directed to the predefined 
regions of the surface which are desired for activation. A 
more detailed description of these protective groups is
provided in U.S. S.N. / , attorney docket number 11509-28
(automated VLSIPS), which is hereby incorporated herein by
reference.

The density of reagents attached to a silicon
substrate may be varied by standard procedures. The surface 
area for attachment of reagents may be increased by modifying 
the silicon surface. For example, a matte surface may be 
machined or etched on the substrate to provide more sites for 
attachment of the particular reagents. Another way to increase 
the density of reagent binding sites is to increase the 
derivatization density of the silicon. Standard procedures for
achieving this are described below.

One method to control the derivatization density is
to derivatize the substrate with photochemical groups at
high density. The substrate is then photolyzed for various
predetermined times, which photoactivates the groups at a
measurable rate, and the activated groups are then reacted
with a capping reagent. By
this method, the density of linker groups may be modulated by
using a desired time and intensity of photoactivation.
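
By way of illustration only, the relationship between photolysis time and remaining linker density may be sketched as follows. The sketch assumes simple first-order photolysis kinetics, which the specification does not state, and the rate constant used is a hypothetical value rather than a measured one.

```python
import math

def remaining_fraction(rate_per_min, minutes):
    """Fraction of photochemical groups not yet photoactivated.

    Assumes first-order kinetics (an assumption of this sketch):
    groups that are photoactivated are capped, so the uncapped
    remainder sets the final linker density.
    """
    return math.exp(-rate_per_min * minutes)

def exposure_minutes_for(target_fraction, rate_per_min):
    """Photolysis time that leaves the requested fraction of groups."""
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return -math.log(target_fraction) / rate_per_min

if __name__ == "__main__":
    k = 0.2  # hypothetical rate constant, per minute
    for t in (0, 2, 5, 10):
        print(t, "min ->", round(remaining_fraction(k, t), 3))
```

In practice the rate would be measured for the particular protecting group, wavelength, and intensity in use.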

In many applications, the number of different 
sequences which may be provided may be limited by the density 
and the size of the substrate on which the matrix pattern is 
generated. In situations where the density is insufficiently 
high to allow the screening of the desired number of sequences, 
multiple substrates may be used to increase the number of 
sequences tested. Thus, the number of sequences tested may be 
increased by using a plurality of different substrates. 
Because the VLSIPS apparatus is almost fully automated, 
increasing the number of substrates does not lead to a 
significant increase in the number of manipulations which must 
be performed by humans. This again leads to greater 
reproducibility and speed in the handling of these multiple 
substrates. 
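
As a rough worked example of the trade-off described above, the number of substrates required to display a complete set of sequences can be estimated from the site density and the usable area. The function name and the example figures below are illustrative assumptions, not values prescribed by this specification.

```python
def substrates_needed(n_sequences, sites_per_cm2, area_cm2):
    """Minimum number of substrates to hold n_sequences distinct regions."""
    capacity = int(sites_per_cm2 * area_cm2)  # regions per substrate
    if capacity < 1:
        raise ValueError("substrate holds no regions")
    return -(-n_sequences // capacity)  # ceiling division

if __name__ == "__main__":
    octamers = 4 ** 8  # every possible 8-mer of A, C, G, T
    for density in (10**3, 10**4, 10**5):
        print(density, "sites/cm2 ->",
              substrates_needed(octamers, density, 1), "substrate(s)")
```

At a hypothetical density of 10,000 regions per square centimeter on a 1 cm² substrate, the 65,536 possible octanucleotides would occupy seven substrates; at 100,000 regions per square centimeter a single substrate would suffice.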

A. Instrumentation

The concept of using VLSIPS generally allows a 
pattern or a matrix of reagents to be generated. The procedure 
for making the pattern is performed by any of a number of 
different methods. An apparatus and instrumentation useful for 
generating a high density VLSIPS substrate is described in
detail in U.S. S.N. 07/492,462 (VLSIPS CIP) and U.S. S.N.
/ , , attorney docket number 11509-28 (automated VLSIPS).

B. Binary Masking 

The details of the binary masking are described in an 
accompanying application filed simultaneously with this,
U.S. S.N. / , attorney docket number 11509-28 (automated
VLSIPS), whose specification is incorporated herein by
reference.

For example, the binary masking technique allows for
producing a plurality of sequences based on the selection of
either of two possibilities at any particular location. By a
series of binary masking steps, the binary decision may be the
determination, on a particular synthetic cycle, whether or not
to add any particular one of the possible subunits. By
treating various regions of the matrix pattern in parallel, the
binary masking strategy provides the ability to carry out 
spatially addressable parallel synthesis. 
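
The binary decision described above may be sketched, purely for illustration, as follows. The sketch assumes one mask per synthetic cycle and assigns each region a binary index whose bits record which cycles exposed it; this indexing scheme is an assumption made for the example and is not taken from the incorporated application.

```python
def binary_synthesis(monomers):
    """Simulate n masked cycles over 2**n regions.

    In cycle i the mask exposes exactly those regions whose index has
    bit i set (counting from the most significant bit), and the
    cycle's monomer is added only to the exposed regions.
    """
    n = len(monomers)
    regions = [""] * (2 ** n)
    for cycle, monomer in enumerate(monomers):
        for r in range(len(regions)):
            exposed = (r >> (n - 1 - cycle)) & 1  # mask for this cycle
            if exposed:
                regions[r] += monomer
    return regions

if __name__ == "__main__":
    for index, product in enumerate(binary_synthesis("ACG")):
        print(format(index, "03b"), repr(product))
```

Three cycles thus yield eight regions, each carrying a different subset of the monomers in the order of addition, with all regions processed in parallel during each cycle.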

C. Synthetic Methods 

The synthetic methods in making a substrate are 
described in the parent application, U.S. S.N. 07/492,462. The 
construction of the matrix pattern on the substrate will 
typically be generated by the use of photo-sensitive reagents. 
By use of photo-lithographic optical methods, particular 
segments of the substrate can be irradiated with light to 
activate or deactivate blocking agents, e.g., to protect or
deprotect particular chemical groups. By an appropriate
sequence of photo-exposure steps at appropriate times with 
appropriate masks and with appropriate reagents, the substrates 
can have known polymers synthesized at positionally defined 
regions on the substrate. Methods for synthesizing various 
substrates are described in U.S. S.N. 07/492,462 (VLSIPS CIP)
and U.S. S.N. / , attorney docket number 11509-28
(automated VLSIPS). By a sequential series of these
photo-exposure and reaction manipulations, a defined matrix pattern
of known sequences may be generated, and is typically referred
to as a VLSIPS substrate. In the nucleic acid synthesis
embodiment, nucleosides used in the synthesis of DNA by
photolytic methods will typically be one of the two forms shown
below:



[Structural formulas I and II not reproduced]

B = Adenine, Cytosine, Guanine, or Thymine

In I, the photolabile group at the 5' position is
abbreviated NV (nitroveratryl) and in II, the group is
abbreviated NVOC (nitroveratryl oxycarbonyl). Although not
shown in Fig. C, the bases (adenine, cytosine, and guanine)
contain exocyclic NH2 groups which must be protected during DNA
synthesis. Thymine contains no exocyclic NH2 and therefore
requires no protection. The standard protecting groups for
these amines are shown below:




[Structures not reproduced; panel labels: Adenine (A), Cytosine (C), Guanine (G)]

Other amides of the general formula

[Structure not reproduced]

where R may be alkyl or aryl have been used.

Another type of protecting group, FMOC (9-fluorenyl
methoxycarbonyl), is currently being used to protect the
exocyclic amines of the three bases:

[Structures not reproduced; panel labels: Adenine (A), Cytosine (C), Guanine (G)]

The advantage of the FMOC group is that it is removed
under mild conditions (dilute organic bases) and can be used
for all three bases. The amide protecting groups require more
harsh conditions to be removed (NH3/MeOH with heat).

Nucleosides used as 5'-OH probes, useful in verifying
correct VLSIPS synthetic function, have been the following:

[Structures of the probe compounds not reproduced]

These compounds are used to detect where on a
substrate photolysis has occurred by the attachment of either
III or V to the newly generated 5'-OH. In the case of III,
after the phosphate attachment is made, the substrate is
treated with a dilute base to remove the FMOC group. The
resulting amine can be reacted with FITC and the substrate
examined by fluorescence microscopy. This indicates the proper
generation of a 5'-OH. In the case of compound IV, after the
phosphate attachment is made, the substrate is treated with
FITC labeled streptavidin and the substrate again may be
examined by fluorescence microscopy. Other probes, although
not nucleoside based, have included the following:

[Structures not reproduced]

The method of attachment of the first nucleoside to 
the surface of the substrate depends on the functionality of 
the groups at the substrate surface. If the surface is amine 
functionalized, an amide bond is made (see example below).

If the surface is hydroxy functionalized, a phosphate
bond is made (see example below).

[Structures not reproduced]

In both cases, the thymidine example is illustrated,
but any one of the four phosphoramidite activated nucleosides
can be used in the first step.

Photolysis of the photolabile group NV or NVOC on the
5' positions of the nucleosides is carried out at ~362 nm with
an intensity of 14 mW/cm² for 10 minutes with the substrate
side (side containing the photolabile group) immersed in 
dioxane. After the coupling of the next nucleoside is 
complete, the photolysis is repeated followed by another 
coupling until the desired oligomer is obtained. 
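
For reference, the exposure conditions stated above correspond to a fixed energy dose per photolysis step, which can be computed as in the following sketch. The function names are illustrative, and the assumption that one photolysis step is performed per coupling cycle is made only for the purpose of the example.

```python
def exposure_dose_j_per_cm2(intensity_mw_per_cm2, minutes):
    """Energy delivered per unit area: intensity multiplied by time."""
    seconds = minutes * 60
    return intensity_mw_per_cm2 * seconds / 1000.0  # mJ -> J

def total_photolysis_minutes(n_cycles, minutes_per_cycle=10):
    """Cumulative irradiation time, assuming one photolysis per cycle."""
    return n_cycles * minutes_per_cycle

if __name__ == "__main__":
    print(exposure_dose_j_per_cm2(14, 10), "J/cm2 per step")
    print(total_photolysis_minutes(8), "minutes for 8 cycles")
```

At 14 mW/cm² for 10 minutes, each step delivers approximately 8.4 J/cm².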

One of the most common 3'-O-protecting groups is the
ester, in particular the acetate.

[Structure not reproduced]

These groups can be removed by mild base treatment: 0.1 N
NaOH/MeOH or K2CO3/H2O/MeOH.

Another group used most often is the silyl ether.

[Structure not reproduced]

These groups can be removed by neutral conditions
using 1 M tetra-n-butylammonium fluoride in THF or under acid
conditions.

Related to photodeprotection, the nitroveratryl group
could also be used to protect the 3'-position.

[Structure not reproduced]

Here, light (photolysis) would be used to remove
these protecting groups.

A variety of ethers can also be used in the
protection of the 3'-O-position.

[Structure not reproduced]

Removal of these groups usually involves acid or
catalytic methods.

Note that corresponding linkages and photoblocked
amino acids are described in detail in U.S. S.N. / , ,
attorney docket number 11509-28, which is hereby incorporated
herein by reference.

Although the specificity of interactions at 
particular locations will usually be homogeneous due to a 
homogeneous polymer being synthesized at each defined location, 
for certain purposes, it may be useful to have mixed polymers 
with a commensurate mixed collection of interactions occurring 
at specific defined locations, or degeneracy reducing 
analogues, which have been discussed above and show broad 
specificity in binding. Then, a positive interaction signal 
may result from any of a number of sequences contained therein. 

As an alternative method of generating a matrix 
pattern on a substrate, preformed polymers may be individually 
attached at particular sites on the substrate. This may be 
performed by individually attaching reagents one at a time to 
specific positions on the matrix, a process which may be 
automated. See, e.g., U.S. S.N. 07/435,316 (caged biotin
parent), and U.S. S.N. 07/612,671 (caged biotin CIP). Another
way of generating a positionally defined matrix pattern on a 
substrate is to have individually specific reagents which 
interact with each specific position on the substrate. For 
example, oligonucleotides may be synthesized at defined 
locations on the substrate. Then the substrate would have on 
its surface a plurality of regions having homogeneous 
oligonucleotides attached at each position. 

In particular, at least four different substrate 
preparation procedures are available for treating a substrate 
surface. They are the standard VLSIPS method, polymeric
substrates, Durapore™, and synthetic beads or fibers. The
treatment labeled "standard VLSIPS" method is described in
U.S. S.N. / , , attorney docket number 11509-28 (automated
VLSIPS), and involves applying aminopropyltriethoxysilane to a
glass surface.

The polymeric substrate approach involves either of
two ways of generating a polymeric substrate. The first uses a
high concentration of aminopropyltriethoxysilane (2-20%) in an
aqueous ethanol solution (95%). This allows the silane
compound to polymerize both in solution and on the substrate
surface, which provides a high density of amines on the surface
of the glass. This density is contrasted with the standard
VLSIPS method. This polymeric method allows for the deposition
on the substrate surface of a monolayer due to the anhydrous
method used with the aforementioned silane.

The second polymeric method involves either the 
coating or covalent binding of an appropriate acrylic acid 
polymer onto the substrate surface. In particular, e.g., in
DNA synthesis, a monomer such as a hydroxypropylacrylate is 
used to generate a high density of hydroxyl groups on the 
substrate surface, allowing for the formation of phosphate 
bonds. An example of such a compound is shown: 




The method using a Durapore™ membrane (Millipore)
consists of a polyvinylidene difluoride coating with
crosslinked polyhydroxylpropyl acrylate [PVDF-HPA]:

[Structure not reproduced]

Here the building up of, e.g., a DNA oligomer, can be started
immediately since phosphate bonds to the surface can be
accomplished in the first step with no need for modification.
A nucleotide dimer (5'-C-T-3') has been successfully made on
this substrate in our labs.

The fourth method utilizes synthetic beads or fibers. 
This would use another substrate, such as a Teflon copolymer
graft bead or fiber, which is covalently coated with an organic
layer (hydrophilic) terminating in hydroxyl sites (commercially
available from Molecular Biosystems, Inc.). This would offer
the same advantage as the Durapore™ membrane, allowing for
immediate phosphate linkages, but would give additional contour
by the 3-dimensional growth of oligomers.

A matrix pattern of new reagents may be targeted to
each specific oligonucleotide position by attaching a
complementary oligonucleotide to which the substrate bound form
is complementary. For instance, a number of regions may have
homogeneous oligonucleotides synthesized at various locations.
Oligonucleotide sequences complementary to each of these can be
individually generated and linked to a particular specific
reagent. Often these specific reagents will be antibodies.
As each of these is specific for finding its complementary 
oligonucleotide, each of the specific reagents will bind 
through the oligonucleotide to the appropriate matrix position. 
A single step having a combination of different specific 
reagents being attached specifically to a particular 
oligonucleotide will thereby bind to its complement at the 
defined matrix position. The oligonucleotides will typically 
then be covalently attached, using, e.g., an acridine dye, for 
photocrosslinking. Psoralen is a commonly used acridine dye 
for photocrosslinking purposes, see, e.g., Song et al. (1979)
Photochem. Photobiol. 29:1177-1197; Cimino et al. (1985) Ann.
Rev. Biochem. 54:1151-1193; Parsons (1980) Photochem.
Photobiol. 32:813-821; and Dattagupta et al. (1985) U.S. Pat.
No. 4,542,102, and (1987) U.S. Pat. No. 4,713,326; each of
which is hereby incorporated herein by reference. This method 
allows a single attachment manipulation to attach all of the 
specific reagents to the matrix at defined positions and 
results in the specific reagents being homogeneously located at 
defined positions. In many embodiments, the specific reagents
will be antibodies.

In an alternative embodiment, antibody molecules may
be used to specifically direct binding to defined positions on
a substrate. The VLSIPS technology may be used to generate
specific epitopes at each position on the substrate. Antibody
molecules having specificity of interaction may be used to
attach oligonucleotides, thereby avoiding the interference of
internal polynucleotide sequences from binding to the substrate 
complementary oligonucleotides. In fact, the specificity of 
interaction for positional targeting may be achieved by use of 
nucleotide analogues which do not interact with the natural 
nucleotides. For example, other synthetic nucleotides have 
been made which undergo base pairing, thereby providing the 
specificity of targeting, but the synthetic nucleotides also do 
not interact with the natural biological nucleotides. Thus, 
synthetic oligonucleotides would be useful for attachment to 
biological nucleotides and specific targeting. Moreover, the 
VLSIPS synthetic processes would be useful in generating the 
VLSIPS substrate, and standard oligonucleotide synthesis could 
be applied, with minor modifications, to produce the 
complementary sequences which would be attached to other 
specific reagents. 

D. Surface Immobilization
1. caged biotin

An alternative method of attaching reagents in a 
positionally defined matrix pattern is to use a caged biotin 
system. See U.S. S.N. 07/612,671 (caged biotin CIP), which is
hereby incorporated herein by reference, for additional details
on the chemistry and application of caged biotin embodiments.
In short, the caged biotin has a photosensitive blocking moiety 
which prevents the combination of avidin to biotin. At 
positions where the photo-lithographic process has removed the 
blocking group, high affinity biotin sites are generated. 
Thus, by a sequential series of photolithographic deblocking 
steps interspersed with exposure of those regions to 
appropriate biotin containing reagents, only those locations 
where the deblocking takes place will form an avidin-biotin
interaction. Because the avidin-biotin binding is very tight,
this will usually be virtually irreversible binding.

2. crosslinked interactions
The surface immobilization may also take place by
photocrosslinking of defined oligonucleotides linked to
specific reagents. After hybridization of the complementary
oligonucleotides, the oligonucleotides may be crosslinked by a
reagent such as psoralen or another similar type of acridine dye.
Other useful crosslinking reagents are described in Dattagupta
et al. (1985) U.S. Pat. No. 4,542,102, and (1987) U.S. Pat. No.
4,713,326.

In another embodiment, colonies or phage plaques
containing biological polymers may be transferred directly
onto a silicon substrate. For example, a colony plate may be
transferred onto a substrate having a generic oligonucleotide 
sequence which hybridizes to another generic complementary 
sequence contained on all of the vectors into which inserts are 
cloned. This will specifically only bind those molecules which 
are actually contained in the vectors containing the desired 
complementary sequence. This immobilization allows for 
producing a matrix onto which a sequence specific reagent can 
bind, or for other purposes. In a further embodiment, a 
plurality of different vectors each having a specific 
oligonucleotide attached to the vector may be specifically 
attached to particular regions on a matrix having a 
complementary oligonucleotide attached thereto.

VIII. HYBRIDIZATION/SPECIFIC INTERACTION 

A. General 

As discussed previously in the VLSIPS parent 
applications, the VLSIPS substrates may be used for screening 
for specific interactions with sequence specific targets or 
probes. 

In addition, the availability of substrates having 
the entire repertoire of possible sequences of a defined length 
opens up the possibility of sequencing by hybridization. This
sequencing may be de novo determination of an unknown sequence,
particularly of nucleic acid, verification of a sequence
determined by another method, or an investigation of changes in
a previously sequenced gene, locating and identifying specific
changes. For example, often Maxam and Gilbert sequencing
techniques are applied to sequences which have been determined
by Sanger and Coulson. Each of those sequencing technologies
has problems with resolving particular types of sequences.
Sequencing by hybridization may serve as a third and 
independent method for verifying other sequencing techniques. 
See, e.g., (1988) Science 242:1245. 
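
The principle of sequencing by hybridization may be illustrated by the following minimal sketch, in which the set of probes giving a positive signal is treated as the set of all subsequences of length k present in the target. The sketch assumes an error-free set of signals, a known starting probe, and no repeated (k-1)-mers in the target, none of which is guaranteed in practice; the function names are illustrative only.

```python
def spectrum(target, k):
    """All length-k subsequences of the target (idealized probe hits)."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}

def reconstruct(kmers, start):
    """Rebuild a sequence by chaining probes that overlap by k-1 bases.

    Requires k >= 2 and a unique extension at every step.
    """
    k = len(start)
    remaining = set(kmers) - {start}
    sequence = start
    while remaining:
        suffix = sequence[-(k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            raise ValueError("ambiguous or incomplete spectrum")
        sequence += candidates[0][-1]
        remaining.remove(candidates[0])
    return sequence

if __name__ == "__main__":
    hits = spectrum("ATGCGTAC", 3)
    print(sorted(hits))
    print(reconstruct(hits, "ATG"))
```

Where the target contains repeated subsequences the chain becomes ambiguous, which is one reason longer probes or independent verification may be needed.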

In addition, the ability to provide a large
repertoire of particular sequences allows use of short 
subsequence and hybridization as a means to fingerprint a 
sample. This may be used in a nucleic acid, as well as other 
polymer embodiments. For example, fingerprinting to a high 
degree of specificity of sequence matching may be used for 
identifying highly similar samples, e.g., those exhibiting high 
homology to the selected probes. This may provide a means for 
determining classifications of particular sequences. This 
should allow determination of whether particular genomes of 
bacteria, phage, or even higher cells might be related to one 
another. 
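
One simple way to compare fingerprints of this kind is sketched below. It assumes, for illustration only, that a probe gives a positive signal when its sequence occurs exactly in the sample (complementarity and mismatches are ignored) and that similarity is scored as the fraction of positive probes shared by two samples (a Jaccard index); neither the scoring rule nor the example probes come from this specification.

```python
def fingerprint(sample, probes):
    """Set of probes that occur in the sample (idealized exact match)."""
    return frozenset(p for p in probes if p in sample)

def similarity(fp_a, fp_b):
    """Shared positive probes divided by all positive probes."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 1.0

if __name__ == "__main__":
    probes = ["ACG", "CGT", "TTT", "GGA"]  # hypothetical probe set
    a = fingerprint("AACGTT", probes)
    b = fingerprint("GGACGA", probes)
    print(sorted(a), sorted(b), similarity(a, b))
```

Samples could then be grouped by such scores to suggest a classification, subject to the stringency considerations discussed in Section VIII.B.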

In addition, fingerprinting may be used to identify 
an individual source of biological sample. See, e.g., Lander, 
E. (1989) Nature 339:501-505, and references therein. For
example, a DNA fingerprint may be used to determine whether a 
genetic sample arose from another individual. This would be 
particularly useful in various sorts of forensic tests to 
determine, e.g., paternity or sources of blood samples. 
Significant detail on the particulars of genetic fingerprinting
for identification purposes is described in, e.g., Morris et
al. (1989) "Biostatistical evaluation of evidence from
continuous allele frequency distribution DNA probes in
reference to disputed paternity of identity," J. Forensic
Science 34:1311-1317; and Neufeld et al. (1990) Scientific
American 262:46-53; each of which is hereby incorporated herein
by reference.

In another embodiment, a fingerprinting-like
procedure may be used for classifying cell types by analyzing a 
pattern of specific nucleic acids present in the cell. A 
series of antibodies may be used to identify cell markers, 
e.g., proteins, usually on the cell surface, but intracellular 
markers may also be used. Antigens which are extracellularly 
expressed are preferred so cell lysis is unnecessary in the 
screening, but intracellular markers may also be useful. The 
markers will usually be proteins, but may be nucleic acids, 
lipids, metabolites, carbohydrates, or other cellular 
components. See, e.g., Winkelgren, I. (1990) Science News 
136:234-237, which indicates extracellular DNA may be common, and
suggests that such might be characteristic of cell types,
stage, or physiology. This may also be useful in defining the 
temporal stage of development of cells, e.g., stem cells or 
other cells which undergo temporal changes in development. For 
example, the stage of a cell, or group of cells, may be tested 
or defined by isolating a sample of mRNA from the population 
and testing to see what sequences are present in messenger 
populations. Direct samples, or amplified samples, may be 
used. Where particular mRNA or other nucleic acid sequences 
may be characteristic of or shown to be characteristic of 
particular developmental stages, physiological states, or other 
conditions, this fingerprinting method may define them. 
Similar sorts of fingerprinting may be used for determining T- 
cell classes or perhaps even to generate classification schemes 
for such proteins as major histocompatibility complex antigens. 
Thus, the ability to make these substrates allows both the 
generation of reagents which will be used for defining 
subclasses or classes of cells or other biological materials, 
but also provides the mechanisms for selecting those cells 
which may be found in defined population groups. 

With cell classification defined by such a combination of
properties, typically expression of extracellular antigens, the
present invention also provides the means for isolating a
homogeneous population of cells. Once the antigenic
determinants which define a cell class have been identified,
these antigens may be used in a sequential selection process to
isolate only those cells which exhibit the combination of
defining structural properties.

The present invention may also be used for mapping 
sequences within a larger segment. This may be performed by at 
least two methods, particularly in reference to nucleic acids. 
Often, enormous segments of DNA are subcloned into a large
plurality of subsequences. Ordering these subsequences may be 
important in determining the overlaps of sequences upon 
nucleotide determinations. Mapping may be performed by 
immobilizing particularly large segments onto a matrix using 
the VLSIPS technology. Alternatively, sequences may be ordered 
by virtue of subsequences shared by overlapping segments. See,
e.g., Craig et al. (1990) Nuc. Acids Res. 18:2653-2660;
Michiels et al. (1987) CABIOS 3:203-210; and Olson et al.
(1986) Proc. Natl. Acad. Sci. USA 83:7826-7830.
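
The second mapping approach, ordering segments by the subsequences they share, may be sketched as follows. The sketch assumes that the subclones form a single linear chain and that each overlap is an exact match between the end of one segment and the start of the next; the function names and the minimum overlap length are assumptions chosen for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def order_clones(clones, min_len=3):
    """Arrange clones left to right using shared end subsequences."""
    successor = {}
    for a in clones:
        for b in clones:
            if a != b and overlap(a, b, min_len):
                successor[a] = b
    starts = [c for c in clones if c not in successor.values()]
    if len(starts) != 1:
        raise ValueError("clones do not form a single linear chain")
    order = [starts[0]]
    while order[-1] in successor and len(order) < len(clones):
        order.append(successor[order[-1]])
    return order

if __name__ == "__main__":
    print(order_clones(["ACCTGA", "ATGCGT", "CGTACC"]))
```

With hybridization data, the shared subsequences would be inferred from probes that give a positive signal with both segments rather than from the sequences themselves.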

B. Important Parameters 

The extent of specific interaction between reagents 
immobilized to the VLSIPS substrate and another sequence 
specific reagent may be modified by the conditions of the 
interaction. Sequencing embodiments typically require high 
fidelity hybridization and the ability to discriminate perfect 
matching from imperfect matching. Fingerprinting and mapping 
embodiments may be performed using less stringent conditions, 
depending upon the circumstances. 

For example, the specificity of antibody/antigen 
interaction may depend upon such parameters as pH, salt
concentration, ionic composition, solvent composition,
detergent composition and concentration, and chaotropic agent
concentration. See, e.g., Harlow and Lane (1988) Antibodies:
A Laboratory Manual, Cold Spring Harbor Press, New York. By
careful control of these parameters, the affinity of binding 
may be mapped across different sequences. 

In a nucleic acid hybridization embodiment, the 
specificity and kinetics of hybridization have been described 
in detail by, e.g., Wetmur and Davidson (1968) J. Mol. Biol.
31:349-370, Britten and Kohne (1968) Science 161:529-530, and
Kanehisa (1984) Nuc. Acids Res. 12:203-213, each of which is
hereby incorporated herein by reference. Parameters which are
well known to affect specificity and kinetics of reaction 
include salt conditions, ionic composition of the solvent, 
hybridization temperature, length of oligonucleotide matching 
sequences, guanine and cytosine (GC) content, presence of 
hybridization accelerators, pH, specific bases found in the 
matching sequences, solvent conditions, and addition of organic 
solvents. 

In particular, the salt conditions required for 
driving highly mismatched sequences to completion typically 
include a high salt concentration. The typical salt used is
sodium chloride (NaCl); however, other ionic salts may be
utilized, e.g., KCl. Depending on the desired stringency of
hybridization, the salt concentration will often be less than
about 3 molar, more often less than 2.5 molar, usually less
than about 2 molar, and more usually less than about 1.5 molar.
For applications directed towards higher stringency matching,
the salt concentrations would typically be lower. Ordinary
high stringency conditions will utilize salt concentration of
less than about 1 molar, more often less than about 750
millimolar, usually less than about 500 millimolar, and may be
as low as about 250 or 150 millimolar.

The kinetics of hybridization and the stringency of 
hybridization both depend upon the temperature at which the 
hybridization is performed and the temperature at which the 
washing steps are performed. Temperatures at which steps for
low stringency hybridization are desired would typically be
lower temperatures, e.g., ordinarily at least about 15°C, more
ordinarily at least about 20°C, usually at least about 25°C,
and more usually at least about 30°C. For those applications
requiring high stringency hybridization, or fidelity of
hybridization and sequence matching, temperatures at which
hybridization and washing steps are performed would typically
be high. For example, temperatures in excess of about 35°C
would often be used, more often in excess of about 40°C,
usually at least about 45°C, and occasionally even temperatures
as high as about 50°C or 60°C or more. Of course, the
hybridization of oligonucleotides may be disrupted by even
higher temperatures. Thus, for stripping of targets from
substrates, as discussed below, temperatures as high as 80°C
or even higher may be used.

The base composition of the specific oligonucleotides 
involved in hybridization affects the temperature of melting,
and the stability of hybridization as discussed in the above
references. However, the bias of GC-rich sequences to
hybridize faster and retain stability at higher temperatures
can be compensated for by the inclusion in the hybridization
incubation or wash steps of various buffers. Sample buffers
which accomplish this result include the triethyl- and trimethyl
ammonium buffers. See, e.g., Wood et al. (1987) Proc. Natl.
Acad. Sci. USA 82:1585-1588, and Khrapko, K. et al. (1989)
FEBS Letters 256:118-122.
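
The dependence of melting temperature on base composition can be illustrated with the widely used "Wallace rule" approximation for short oligonucleotides, in which each A or T contributes about 2°C and each G or C about 4°C. This rule of thumb is not taken from this specification or from the references cited above, and it is only a rough estimate under standard salt conditions.

```python
def wallace_tm(oligo):
    """Approximate melting temperature (deg C) of a short oligonucleotide.

    Wallace rule: Tm = 2 * (A + T) + 4 * (G + C). Intended only as an
    estimate for short probes.
    """
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc

if __name__ == "__main__":
    for probe in ("ATATATAT", "ATGCATGC", "GCGCGCGC"):
        print(probe, wallace_tm(probe))
```

The spread between an AT-only and a GC-only octamer in this estimate shows why a single hybridization temperature does not discriminate equally well for all probes on a substrate, and why buffers that reduce the dependence on base composition are of interest.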

The rate of hybridization can also be affected by the 
inclusion of particular hybridization accelerators. These 
hybridization accelerators include the volume exclusion agents 
characterized by dextran sulfate, or polyethylene glycol (PEG).
Dextran sulfate is typically included at a concentration of
between 1% and 40% by weight. The actual concentration
selected depends upon the application, but typically a faster
hybridization is desired in which the concentration is
optimized for the system in question. Dextran sulfate is often
included at a concentration of between 0.5% and 2% by weight or
dextran sulfate at a concentration between about 0.5% and 5%.
Alternatively, proteins which accelerate hybridization may be
added, e.g., the recA protein found in E. coli, or other
homologous proteins.

With respect to those embodiments where specific
reagents are not oligonucleotides, the conditions of specific
interaction would depend on the affinity of binding between the
specific reagent and its target. Typically, parameters which
would be of particular importance would be pH, salt
concentration, anion and cation compositions, buffer
concentration, organic solvent inclusion, detergent
concentration, and inclusion of reagents such as
chaotropic agents. In particular, the affinity of binding may
be tested over a variety of conditions by multiple washes and
repeat scans or by using reagents with differences in binding
affinity to determine which reagents bind or do not bind under
the selected binding and washing conditions. The spectrum of
binding affinities may provide an additional dimension of
information which may be very useful in identification purposes
and mapping.

Of course, the specific hybridization conditions will
be selected to correspond to a discriminatory condition which
provides a positive signal where desired but fails to show a
positive signal at affinities where interaction is not desired.
This may be determined by a number of titration steps or with a
number of controls which will be run during the hybridization
and/or washing steps to determine at what point the
hybridization conditions have reached the stage of desired
specificity.

IX. DETECTION METHODS

Methods for detection depend upon the label selected.
The criteria for selecting an appropriate label are discussed
below; however, a fluorescent label is preferred because of its
extreme sensitivity and simplicity. Standard labeling
procedures are used to determine the positions where
interactions between a sequence and a reagent take place. For
example, if a target sequence is labeled and exposed to a
matrix of different probes, only those locations where probes
do interact with the target will exhibit any signal.
Alternatively, other methods may be used to scan the matrix to 
determine where interaction takes place. Of course, the 
spectrum of interactions may be determined in a temporal manner 
by repeated scans of interactions which occur at each of a 
multiplicity of conditions. However, instead of testing each 
individual interaction separately, a multiplicity of sequence 
interactions may be simultaneously determined on a matrix. 

A. Labeling Techniques 

The target polynucleotide may be labeled by any of a
number of convenient detectable markers. A fluorescent label
is preferred because it provides a very strong signal with low
background. It is also optically detectable at high resolution
and sensitivity through a quick scanning procedure. Other
potential labeling moieties include radioisotopes,
chemiluminescent compounds, labeled binding proteins, heavy
metal atoms, spectroscopic markers, magnetic labels, and linked
enzymes.

Another method for labeling may bypass any label of
the target sequence. The target may be exposed to the probes, 
and a double strand hybrid is formed at those positions only. 
Addition of a double strand specific reagent will detect where 
hybridization takes place. An intercalative dye such as 
ethidium bromide may be used as long as the probes themselves 
do not fold back on themselves to a significant extent forming 
hairpin loops. See, e.g., Sheldon et al. (1986) U.S. Pat. No. 
4,582,789. However, the length of the hairpin loops in short 
oligonucleotide probes would typically be insufficient to form 
a stable duplex. 

In another embodiment, different targets may be 
simultaneously sequenced where each target has a different 
label. For instance, one target could have a green fluorescent 
label and a second target could have a red fluorescent label. 
The scanning step will distinguish sites of binding of the red
label from those binding the green fluorescent label. Each
sequence can be analyzed independently of the other.

Suitable chromogens will include molecules and 
compounds which absorb light in a distinctive range of 
wavelengths so that a color may be observed, or emit light when
irradiated with radiation of a particular wavelength or
wavelength range, e.g., fluorescers. Biliproteins, e.g.,
phycoerythrin, may also serve as labels.

A wide variety of suitable dyes are available, being
primarily chosen to provide an intense color with minimal
absorption by their surroundings. Illustrative dye types
include quinoline dyes, triarylmethane dyes, acridine dyes,
alizarine dyes, phthaleins, insect dyes, azo dyes,
anthraquinoid dyes, cyanine dyes, phenazathionium dyes, and
phenazoxonium dyes.

A wide variety of fluorescers may be employed either
by themselves or in conjunction with quencher molecules.
Fluorescers of interest fall into a variety of categories
having certain primary functionalities. These primary
functionalities include 1- and 2-aminonaphthalene,
p,p'-diaminostilbenes, pyrenes, quaternary phenanthridine salts,
9-aminoacridines, p,p'-diaminobenzophenone imines, anthracenes,
oxacarbocyanine, merocyanine, 3-aminoequilenin, perylene,
bis-benzoxazole, bis-p-oxazolyl benzene, 1,2-benzophenazin,
retinoid, bis-3-aminopyridinium salts, hellebrigenin,
tetracycline, sterophenol, benzimidazolylphenylamine,
2-oxo-3-chromen, indole, xanthen, 7-hydroxycoumarin, phenoxazine,
salicylate, strophanthidin, porphyrins, triarylmethanes and
flavin. Individual fluorescent compounds which have
functionalities for linking or which can be modified to
incorporate such functionalities include, e.g., dansyl
chloride; fluoresceins such as
3,6-dihydroxy-9-phenylxanthhydrol; rhodamineisothiocyanate;
N-phenyl 1-amino-8-sulfonatonaphthalene;
N-phenyl 2-amino-6-sulfonatonaphthalene;
4-acetamido-4-isothiocyanato-stilbene-2,2'-disulfonic acid;
pyrene-3-sulfonic acid; 2-toluidinonaphthalene-6-sulfonate;
N-phenyl, N-methyl 2-aminonaphthalene-6-sulfonate;
ethidium bromide; stebrine; auromine-0, 2-(9'-anthroyl)palmitate;
dansyl phosphatidylethanolamine;
N,N'-dioctadecyl oxacarbocyanine;
N,N'-dihexyl oxacarbocyanine; merocyanine,
4-(3'-pyrenyl)butyrate; d-3-aminodesoxy-equilenin;
12-(9'-anthroyl)stearate; 2-methylanthracene; 9-vinylanthracene;
2,2'-(vinylene-p-phenylene)bisbenzoxazole;
p-bis[2-(4-methyl-5-phenyl-oxazolyl)]benzene;
6-dimethylamino-1,2-benzophenazin; retinol;
bis(3'-aminopyridinium) 1,10-decandiyl diiodide;
sulfonaphthylhydrazone of hellebrigenin; chlorotetracycline;
N-(7-dimethylamino-4-methyl-2-oxo-3-chromenyl)maleimide;
N-[p-(2-benzimidazolyl)-phenyl]maleimide;
N-(4-fluoranthyl)maleimide; bis(homovanillic acid); resazarin;
4-chloro-7-nitro-2,1,3-benzooxadiazole; merocyanine 540;
resorufin; rose bengal; and 2,4-diphenyl-3(2H)-furanone.

Desirably, fluorescers should absorb light above
about 300 nm, preferably about 350 nm, and more preferably
above about 400 nm, usually emitting at wavelengths greater
than about 10 nm higher than the wavelength of the light
absorbed. It should be noted that the absorption and emission
characteristics of the bound dye may differ from the unbound
dye. Therefore, when referring to the various wavelength
ranges and characteristics of the dyes, it is intended to
indicate the dyes as employed and not the dye which is
unconjugated and characterized in an arbitrary solvent.

Fluorescers are generally preferred because by
irradiating a fluorescer with light, one can obtain a plurality
of emissions. Thus, a single label can provide for a plurality 
of measurable events. 

Detectable signal may also be provided by
chemiluminescent and bioluminescent sources. Chemiluminescent
sources include a compound which becomes electronically excited
by a chemical reaction and may then emit light which serves as
the detectable signal or donates energy to a fluorescent
acceptor. A diverse number of families of compounds have been
found to provide chemiluminescence under a variety of
conditions. One family of compounds is
2,3-dihydro-1,4-phthalazinedione. The most popular compound is
luminol, which is the 5-amino compound. Other members of the
family include the 5-amino-6,7,8-trimethoxy- and the
dimethylamino[ca]benz analog. These compounds can be made to
luminesce with alkaline hydrogen peroxide or calcium
hypochlorite and base. Another family of compounds is the
2,4,5-triphenylimidazoles, with lophine as the common name for
the parent product. Chemiluminescent analogs include
para-dimethylamino and -methoxy substituents. Chemiluminescence
may also be obtained with oxalates, usually oxalyl active
esters, e.g., p-nitrophenyl and a peroxide, e.g., hydrogen
peroxide, under basic conditions. Alternatively, luciferins may
be used in conjunction with luciferase or lucigenins to provide
bioluminescence.

Spin labels are provided by reporter molecules with
an unpaired electron spin which can be detected by electron
spin resonance (ESR) spectroscopy. Exemplary spin labels
include organic free radicals, transition metal complexes,
particularly vanadium, copper, iron, and manganese, and the
like. Exemplary spin labels include nitroxide free radicals.

B. Scanning System 

With the automated detection apparatus, the
correlation of specific positional labeling is converted to the
presence on the target of sequences for which the reagents have
specificity of interaction. Thus, the positional information
is directly converted to a database indicating what sequence
interactions have occurred. For example, in a nucleic acid
hybridization application, the sequences which have interacted
between the substrate matrix and the target molecule can be
directly listed from the positional information. The detection
system used is described in U.S.S.N. 07/649,642 (VLSIPS CIP);
and U.S.S.N. / , , attorney docket number 11509-28
(automated VLSIPS). Although the detection described therein
is a fluorescence detector, the detector may be replaced by a
spectroscopic or other detector. The scanning system may make
use of a moving detector relative to a fixed substrate, a fixed
detector with a moving substrate or a combination.
Alternatively, mirrors or other apparatus can be used to
transfer the signal directly to the detector. See, e.g.,
U.S.S.N. / , , attorney docket number 11509-28 (automated
VLSIPS), which is hereby incorporated herein by reference.

The detection method will typically also incorporate
some signal processing to determine whether the signal at a
particular matrix position is a true positive or may be a
spurious signal. For example, a signal from a region which has
actual positive signal may tend to spread over and provide a
positive signal in an adjacent region which actually should not
have one. This may occur, e.g., where the scanning system is
not properly discriminating with sufficiently high resolution
in its pixel density to separate the two regions. Thus, the
signal over the spatial region may be evaluated pixel by pixel
to determine the locations and the actual extent of positive
signal. A true positive signal should, in theory, show a
uniform signal at each pixel location. Thus, processing by
plotting number of pixels with actual signal intensity should
have a clearly uniform signal intensity. Regions where the
signal intensities show a fairly wide dispersion may be
particularly suspect and the scanning system may be programmed
to more carefully scan those positions.
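
The pixel-by-pixel evaluation described above may be sketched as follows. This is only an illustrative sketch: the function names, the use of the coefficient of variation as the measure of dispersion, and the threshold value are assumptions and are not taken from the referenced applications.

```python
from statistics import mean, pstdev

def region_dispersion(pixels):
    """Return the coefficient of variation (std / mean) of a region's pixel intensities."""
    m = mean(pixels)
    if m == 0:
        return 0.0
    return pstdev(pixels) / m

def flag_suspect_regions(regions, max_cv=0.25):
    """Return the names of regions whose pixel intensities are widely dispersed.

    `regions` maps a region name to a list of pixel intensities.
    `max_cv` is an assumed, illustrative threshold.
    """
    return [name for name, pixels in regions.items()
            if region_dispersion(pixels) > max_cv]

regions = {
    "A1": [100, 102, 98, 101],  # nearly uniform: consistent with a true positive
    "A2": [100, 20, 5, 90],     # widely dispersed: candidate for a more careful rescan
}
suspects = flag_suspect_regions(regions)
```

Regions returned in `suspects` would be the ones the scanning system is programmed to scan again at higher resolution.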

In another embodiment, as the sequence of a target is
determined at a particular location, the overlap for the
sequence would necessarily have a known sequence. Thus, the
system can compare the possibilities for the next adjacent
position and look at these in comparison with each other.
Typically, only one of the possible adjacent sequences should 
give a positive signal and the system might be programmed to 
compare each of these possibilities and select that one which 
gives a strong positive. In this way, the system can also 
simultaneously provide some means of measuring the reliability 
of the determination by indicating what the average signal to 
background ratio actually is. 
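
By way of illustration, the comparison of the possible adjacent sequences may be sketched as below. The function name and the example intensities are hypothetical; the sketch assumes that one intensity has been measured for each of the four probes extending the known overlap by one base, and that a background level is available.

```python
def call_next_base(signals, background):
    """Pick the strongest candidate extension and report its signal to background ratio.

    `signals` maps each candidate next base to the measured intensity of the
    probe that overlaps the known sequence and ends in that base.
    """
    best = max(signals, key=signals.get)
    ratio = signals[best] / background
    return best, ratio

# Hypothetical intensities for the four probes extending a known overlap.
base, ratio = call_next_base({"A": 12.0, "C": 310.0, "G": 15.0, "T": 9.0},
                             background=10.0)
```

A low ratio for the selected base would indicate that the determination at that position is less reliable.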

More sophisticated signal processing techniques can
be applied to the initial determination of whether a positive
signal exists or not. See, e.g., U.S.S.N. / , , attorney
docket number 11509-28 (automated VLSIPS).

From a listing of those sequences which interact, 
data analysis may be performed on a series of sequences. For 
example, in a nucleic acid sequence application, each of the 
sequences may be analyzed for their overlap regions and the 
original target sequence may be reconstructed from the 
collection of specific subsequences obtained therein. Other 
sorts of analyses for different applications may also be 
performed, and because the scanning system directly interfaces
with a computer, the information need not be transferred
manually. This provides for the ability to handle large
amounts of data with very little human intervention. This, of
course, provides significant advantages over manual
manipulations. Increased throughput and reproducibility are
thereby provided by the automation of the vast majority of steps
in any of these applications.

XI. DATA ANALYSIS

A. General

Data analysis will typically involve aligning the
proper sequences with their overlaps to determine the target
sequence. Although the target "sequence" may not specifically
correspond to any specific molecule, especially where the
target sequence is broken and fragmented up in the sequencing
process, the sequence corresponds to a contiguous sequence of
the subfragments.

The data analysis can be performed by a computer
using an appropriate program. See, e.g., Drmanac, R. et al.
(1989) Genomics 4:114-128; and a commercially available
analysis program available from the Genetic Engineering Center,
P.O. Box 794, 11000 Belgrade, Yugoslavia. Although the
specific manipulations necessary to reassemble the target
sequence from fragments may take many forms, one embodiment
uses a sorting program to sort all of the subsequences using a
defined hierarchy. The hierarchy need not necessarily
correspond to any physical hierarchy, but provides a means to
determine, in order, which subfragments have actually been
found in the target sequence. In this manner, overlaps can be
checked and found directly rather than having to search
throughout the entire set after each selection process. For
example, where the oligonucleotide probes are 10-mers, the
first 9 positions can be sorted. A particular subsequence can
be selected, as in the examples, to determine where the process
starts. Analogous to the theoretical example provided
above, the sorting procedure provides the ability to
immediately find the position of the subsequence which contains
the first 9 positions and can compare whether there exists more
than one subsequence sharing the first 9 positions. In fact, the
computer can easily generate all of the possible target
sequences which contain a given combination of subsequences.
Typically there will be only one, but in various situations,
there will be more.
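
The sorting and overlap procedure may be sketched as follows. This is a minimal sketch under stated assumptions, not the program referenced above: the function name `reconstruct` is hypothetical, all subsequences are assumed to have the same length k, and each k-mer is assumed to occur at most once in the target. The subsequences are indexed on their first k-1 positions, so each overlap is found directly rather than by searching the whole set.

```python
from collections import defaultdict

def reconstruct(subsequences, start):
    """Enumerate every target that uses each positive subsequence exactly once.

    `subsequences` is a set of equal-length strings; `start` is the
    subsequence chosen as the beginning of the target.
    """
    k = len(start)
    # Sorted index on the first k-1 positions of every subsequence.
    by_prefix = defaultdict(list)
    for s in sorted(subsequences):
        by_prefix[s[:k - 1]].append(s)

    results = []

    def extend(sequence, used):
        if len(used) == len(subsequences):
            results.append(sequence)
            return
        # Candidates are those whose first k-1 positions match the current end.
        for nxt in by_prefix[sequence[-(k - 1):]]:
            if nxt not in used:
                extend(sequence + nxt[-1], used | {nxt})

    extend(start, {start})
    return results

# Illustrative 4-mers; with 10-mers the index would use the first 9 positions.
targets = reconstruct({"ACGT", "CGTT", "GTTG", "TTGC", "TGCA"}, "ACGT")
```

Where more than one subsequence shares the same first k-1 positions, the function returns every consistent target, corresponding to the situations noted above in which more than one reconstruction is possible.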

An exemplary flow chart for a sequencing program is 
provided in Figure 4. In general terms, the program provides 
for automated scanning of the substrate to determine the
positions of probe and target interaction. Simple processing
of the intensity of the signal may be incorporated to filter 
out clearly spurious signals. The positions with positive 
interaction are correlated with the sequence specificity of 
specific matrix positions, to generate the set of matching 
subsequences. This information is further correlated with 
other target sequence information, e.g., restriction fragment 
analysis. The sequences are then aligned using overlap data, 
thereby leading to possible corresponding target sequences 
which will, optimally, correspond to a single target sequence. 
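
The correlation of positive positions with probe sequences may be illustrated by the short sketch below. It does not reproduce Figure 4; the function name, the dictionary layout and the fixed intensity threshold are assumptions made only for the example.

```python
def matching_subsequences(intensities, probe_at, threshold):
    """Return the probe sequences at positions whose signal passes a simple filter.

    `intensities` maps a (row, column) matrix position to its measured signal;
    `probe_at` maps the same positions to the probe sequence synthesized there.
    `threshold` is an assumed cutoff used to discard clearly spurious signals.
    """
    return {probe_at[pos] for pos, signal in intensities.items()
            if signal >= threshold}

# Hypothetical 2 x 2 matrix of 4-mer probes and scanned intensities.
probe_at = {(0, 0): "ACGT", (0, 1): "ACGA", (1, 0): "CGTT", (1, 1): "CGTA"}
intensities = {(0, 0): 250.0, (0, 1): 8.0, (1, 0): 190.0, (1, 1): 3.0}
positives = matching_subsequences(intensities, probe_at, threshold=50.0)
```

The resulting set of subsequences is the input to the alignment step that uses overlap data.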

B. Hardware

A variety of computer systems may be used to run a 
sequencing program. The program may be written to provide both 
the detecting and scanning steps together and will typically be 
dedicated to a particular scanning apparatus. However, the
components and functional steps may be separated and the 
scanning system may provide an output, e.g., through tape or an 
electronic connection into a separate computer which separately 
runs the sequencing analysis program. The computer may be any 
of a number of machines provided by standard computer 
manufacturers, e.g., IBM compatible machines, Apple™ machines, 
VAX machines, and others, which may often use a UNIX™ 
operating system. Of course, the hardware used to run the 
analysis program will typically determine what programming 
language would be used. 

C. Software 

Software would be easily developed by a person of 
ordinary skill in the programming art, following the flow chart 
provided, or based upon the input provided and the desired 
result. 

Of course, an exemplary embodiment is a 
polynucleotide sequence system. However, the theoretical and 
mathematical manipulations necessary for data analysis of other 
linear molecules, such as polypeptides, carbohydrates, and 
various other polymers are conceptually similar. Simple 
branching polymers will usually also be sequencable using
similar technology. However, where there is branching, it may
be desired that additional recognition reagents be used to 
determine the nature and location of branches. This can easily 
be provided by use of appropriate specific reagents which would 
be generated by methods similar to those used to produce 
specific reagents for linear polymers. 


XII. SUBSTRATE REUSE

Where a substrate is made with specific reagents that 
are relatively insensitive to the handling and processing steps 
involved in a single cycle of use, the substrate may often be 
reused. The target molecules are usually stripped off of the
solid phase specific recognition molecules. Of course, it is
preferred that the manipulations and conditions be selected so
as to be mild and to not affect the substrate. For example, if a
substrate is acid labile, a neutral pH would be preferred in 
all handling steps. Similar sensitivities would be carefully 
respected where recycling is desired. 

A. Removal of Label 

Typically, for recycling, the previously attached
specific interaction would be disrupted and removed. This will 
typically involve exposing the substrate to conditions under 
which the interaction between probe and target is disrupted. 
Alternatively, it may be exposed to conditions where the target 
is destroyed. For example, where the probes are 
oligonucleotides and the target is a polynucleotide, a heating 
and low salt wash will often be sufficient to disrupt the 
interactions. Additional reagents may be added such as 
detergents, and organic or inorganic solvents which disrupt the 
interaction between the specific reagents and target. In an 
embodiment where the specific reagents are antibodies, the 
substrate may be exposed to a gentle detergent which will 
denature the specific binding between the antibody and its 
target. The conditions are selected to avoid severe disruption 
or destruction of the structure of the antibody and to maintain 
the specificity of the antibody binding site. Conditions with 
specific pH, detergent concentration, salt concentration, ionic
concentration, and other parameters may be selected which
disrupt the specific interactions. 



B. Storage and Preservation

As indicated above, the matrix will typically be
maintained under conditions where the matrix itself and the
linkages and specific reagents are preserved. Various specific
preservatives may be added which prevent degradation. For
example, if the reagents are acid or base labile, a neutral pH
buffer will typically be added. It is also desired to avoid
destruction of the matrix by growth of organisms which may
destroy organic reagents attached thereto. For this reason, a
preservative such as cyanide or azide may be added. However,
the chemical preservative should also be selected to preserve
the chemical nature of the linkages and other components of the
substrate. Typically, a detergent may also be included.

C. Processes to Avoid Degradation of Oligomers

In particular, a substrate comprising a large number
of oligomers will be treated in a fashion which is known to
maintain the quality and integrity of oligonucleotides. These
include storing the substrate in a carefully controlled
environment under conditions of lower temperature, cation
depletion (EDTA and EGTA), sterile conditions, and inert argon
or nitrogen atmosphere.

XIII. INTEGRATED SEQUENCING STRATEGY 

A. Initial Mapping Strategy 

As indicated above, although the VLSIPS may be 
applied to sequencing embodiments, it is often useful to 
integrate other concepts to simplify the sequencing. For
example, nucleic acids may be easily sequenced by careful 
selection of the vectors and hosts used for amplifying and 
generating the specific target sequences. For example, it may 
be desired to use specific vectors which have been designed to 
interact most efficiently with the VLSIPS substrate. This is 
also important in fingerprinting and mapping strategies. For 
example, vectors may be carefully selected having particular
complementary sequences which are designed to attach to a
genetic or specific oligomer on the substrate. This is also 
applicable to situations where it is desired to target 
particular sequences to specific locations on the matrix. 

In one embodiment, unnatural oligomers may be used to 
target natural probes to specific locations on the VLSIPS 
substrate. In addition, particular probes may be generated for 
the mapping embodiment which are designed to have specific 
combinations of characteristics. For example, the construction 
of a mapping substrate may depend upon use of another automated 
apparatus which takes clones isolated from a chromosome walk 
and attaches them individually or in bulk to the VLSIPS 
substrate. 

In another embodiment, a variety of specific vectors 
having known and particular "targeting" sequences adjacent the 
cloning sites may be individually used to clone a selected 
probe, and the isolated probe will then be targetable to a site 
on the VLSIPS substrate with a sequence complementary to the
"target" sequence.

B. Selection of Smaller Clones

In the fingerprinting and mapping embodiments, the 
selection of probes may be very important. Significant 
mathematical analysis may be applied to determine which 
specific sequences should be used as those probes. Of course, 
for fingerprinting use, these sequences would be most desired 
that show significant heterogeneity across the human 
population. Selection of the specific sequences which would 
most favorably be utilized will tend to be single copy 
sequences within the genome. 

Various hybridization selection procedures may be 
applied to select sequences which tend not to be repeated 
within a genome, and thus would tend to be conserved across 
individuals. For example, hybridization selections may be made 
for non-repetitive and single copy sequences. See, e.g.,
Britten and Kohne (1968) "Repeated Sequences in DNA," Science
161:529-540. On the other hand, it may be desired under
certain circumstances to use repeated sequences. For example,
where a fingerprint may be used to identify or distinguish
different species, or where repetitive sequences may be
diagnostic of specific species, repetitive sequences may be 
desired for inclusion in the fingerprinting probes. In either 
case, the sequencing capability will greatly assist in the 
selection of appropriate sequences to be used as probes. 

Also as indicated above, various means for
constructing an appropriate substrate may involve either
mechanical or automated procedures. The standard VLSIPS
automated procedure involves synthesizing oligonucleotides or
short polymers directly on the substrate. In various other
embodiments, it is possible to attach separately synthesized
reagents onto the matrix in an ordered array. Other
circumstances may lend themselves to transferring a pattern from
a petri plate onto a solid substrate. Also, there are methods
for site specifically directing collections of reagents to
specific locations using unnatural nucleotides or equivalent
sorts of targeting molecules.

While a brute force manual transfer process may be
utilized, sequentially attaching various samples to successive
positions, instrumentation for automating such procedures may
also be devised. The automated system for performing such
procedures would preferably be relatively easily designed and
conceptually easily understood.

XIV. COMMERCIAL APPLICATIONS 

A. Sequencing 

As indicated above, sequencing may be performed
either de novo or as a verification of another sequencing
method. The present hybridization technology provides the
ability to sequence nucleic acids and polynucleotides de novo,
or as a means to verify either the Maxam and Gilbert chemical
sequencing technique or Sanger and Coulson dideoxy sequencing
techniques. The hybridization method is useful to verify
sequencing determined by any other sequencing technique and to
closely compare two similar sequences, e.g., to identify and
locate sequence differences.

Besides polynucleotide sequencing, the present
invention also provides means for sequencing other polymers.
This includes polypeptides, carbohydrates, synthetic organic
polymers, and other polymers. Again, the sequencing may be
either verification or de novo.

Of course, sequencing can be very important in
many different sorts of environments. For example, it will be 
useful in determining the genetic sequence of particular 
markers in various individuals. In addition, polymers may be 
used as markers or for information containing molecules to 
encode information. For example, a short polynucleotide 
sequence may be included in large bulk production samples 
indicating the manufacturer, date, and location of manufacture 
of a product. For example, various drugs may be encoded with 
this information with a small number of molecules in a batch. 
For example, a pill may have somewhere from 10 to 100 to 1,000 
or more very short and small molecules encoding this 
information. When necessary, this information may be decoded 
from a sample of the material using a polymerase chain reaction 
(PCR) or other amplification method. This encoding system may 
be used to provide the origin of large bulky samples without 
significantly affecting the properties of those samples. For 
example, chemical samples may also be encoded by this method 
thereby providing means for identifying the source and 
manufacturing details of lots. The origin of bulk hydrocarbon 
samples may be encoded. Production lots of organic compounds 
such as benzene or plastics may be encoded with a short 
molecule polymer. Food stuffs may also be encoded using 
similar marking molecules. Even toxic waste samples can be 
encoded to identify the source or origin. In this way, proper
disposal can be traced or more easily enforced. 
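
As a purely illustrative sketch of such an encoding, a short label may be written as a polynucleotide sequence at two bits per base and read back after sequencing. The mapping A=00, C=01, G=10, T=11, the function names and the example label are assumptions; no particular code is specified in the text.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11

def encode(data: bytes) -> str:
    """Encode bytes as a nucleotide string, four bases per byte, high bits first."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data for shift in (6, 4, 2, 0))

def decode(sequence: str) -> bytes:
    """Invert encode(); the sequence length must be a multiple of four."""
    if len(sequence) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

tag = encode(b"LOT7")      # hypothetical lot identifier
recovered = decode(tag)
```

A practical marker would likely add flanking primer sites for amplification and some redundancy against read errors; neither is shown here.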

Similar sorts of encoding may be provided by 
fingerprinting-type analysis. Whether the resolution is 
absolute or less so, the concept of coding information on 
molecules such as nucleic acids, which can be amplified and 
later decoded, may be a very useful and important application. 

This technology also provides the ability to include
markers for origins of biological materials. For example, a
patented animal line may be transformed with a particular
unnatural sequence which can be traced back to its origin.
With a selection of multiple markers, the likelihood could be
negligible that a combination of markers would have
independently arisen from a source other than the patented or
specifically protected source. This technique may provide a
means for tracing the actual origin of particular biological
materials. Bacteria, plants, and animals will be subject to
marking by such encoding sequences.

B. Fingerprinting

As indicated above, fingerprinting technology may
also be used for data encryption. Moreover, fingerprinting
allows for significant identification of particular
individuals. Where the fingerprinting technology is
standardized, and used for identification of large numbers of
people, related equipment and peripheral processing will be
developed to accompany the underlying technology. For example,
specific equipment may be developed for automatically taking a
biological sample and generating or amplifying the information
molecules within the sample to be used in fingerprinting
analysis. Moreover, the fingerprinting substrate may be mass
produced using particular types of automatic equipment.
Synthetic equipment may produce the entire matrix
simultaneously by stepwise synthetic methods as provided by the
VLSIPS technology. The attachment of specific probes onto a
substrate may also be automated, e.g., making use of the caged
biotin technology. See, e.g., U.S.S.N. 07/612,671 (caged
biotin CIP). As indicated above, there are automated methods
for actually generating the matrix and substrate with distinct
sequence reagents positionally located at each of the matrix
positions. Where such reagents are, e.g., unnatural amino
acids, a targeting function may be utilized which does not
interfere with a natural nucleotide functionality.

In addition, peripheral processing may be important 
and may be dedicated to this specific application. Thus, 
automated equipment for producing the substrates may be 
designed, or particular systems which take in a biological
sample and output either a computer readout or an encoded
instrument, e.g., a card or document which indicates the
information and can provide that information to others. An
identification having a short magnetic strip with a few million 
bits may be used to provide individual identification and 
important medical information useful in a medical emergency. 

In fact, data banks may be set up to correlate all of 
this information of fingerprinting with medical information. 
This may allow for the determination of correlations between 
various medical problems and specific DNA sequences. By 
collating large populations of medical records with genetic
information, genetic propensities and genetic susceptibilities 
to particular medical conditions may be developed. Moreover, 
with standardization of substrates, the micro encoding data may 
be also standardized to reproduce the information from a 
centralized data bank or on an encoding device carried on an 
individual person. On the other hand, if the fingerprinting 
procedure is sufficiently quick and routine, every hospital may 
routinely perform a fingerprinting operation and from that 
determine many important medical parameters for an individual. 

In particular industries, the VLSIPS sequencing, 
fingerprinting, or mapping technology will be particularly 
appropriate. As mentioned above, agricultural livestock 
suppliers may be able to encode and determine whether their 
particular strains are being used by others. By incorporating 
particular markers into their genetic stocks, the markers will 
indicate origin of genetic material. This is applicable to 
seed producers, livestock producers, and other suppliers of 
medical or agricultural biological materials. 

This may also be useful in identifying individual 
animals or plants. For example, these markers may be useful in 
determining whether certain fish return to their original 
breeding grounds, whether sea turtles always return to their 
original birthplaces, or to determine the migration patterns
and viability of populations of particular endangered species.
It would also provide means for tracking the sources of
particular animal products. For example, it might be useful
for determining the origins of controlled animal substances
such as elephant ivory or particular bird populations whose
importation or exportation is controlled.

As indicated above, polymers may be used to encode
important information on source and batch and supplier. This
is described in greater detail, e.g., "Applications of PCR to
industrial problems," (1990) in Chemical and Engineering News
68:145, which is hereby incorporated herein by reference. In
fact, the synthetic method can be applied to the storage of
enormous amounts of information. Small substrates may encode
enormous amounts of information, and its recovery will make use
of the inherent replication capacity. For example, with regions
of 10 µm x 10 µm, 1 cm^2 has 10^6 regions. In theory, the entire
human genome could be attached in 1000 nucleotide segments on a
3 cm^2 surface. Genomes of endangered species may be stored on
these substrates.
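
By way of illustration only, the density figures above can be checked with a short calculation. The sketch below is not part of the disclosed method; the function names are hypothetical, and the genome size of roughly 3 x 10^9 nucleotides is an assumption introduced here rather than a figure stated in the text.

```python
# Back-of-the-envelope check of the storage-density figures quoted above.
UM_PER_CM = 10_000  # micrometres per centimetre


def regions_per_cm2(side_um: float) -> float:
    """Number of square regions of the given side length that fit in 1 cm^2."""
    per_edge = UM_PER_CM / side_um
    return per_edge * per_edge


def area_needed_cm2(total_nt: float, segment_nt: int, side_um: float) -> float:
    """Area needed to hold total_nt nucleotides, one segment per region."""
    segments = total_nt / segment_nt
    return segments / regions_per_cm2(side_um)


if __name__ == "__main__":
    print(regions_per_cm2(10))
    # ASSUMPTION: a genome of roughly 3e9 nucleotides (not stated in the text).
    print(area_needed_cm2(3e9, 1000, 10))
```

Under that assumption, 3 x 10^6 segments of 1000 nucleotides each occupy 3 x 10^6 regions, which is consistent with the 3 cm^2 figure given above.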

Fingerprinting may also be used for genetic tracing 
or for identifying individuals for forensic science purposes. 
See, e.g., Morris, J. et al. (1989) "Biostatistical Evaluation
of Evidence From Continuous Allele Frequency Distribution DNA
Probes in Reference to Disputed Paternity and Identity," J.
Forensic Science 34:1311-1317, and references provided therein;
each of which is hereby incorporated herein by reference. 

In addition, high resolution fingerprinting allows
particular samples to be distinguished at high resolution.
As indicated above, new cell classifications may be
defined based on combinations of a large number of properties. 
Similar applications will be found in distinguishing different 
species of animals or plants. In fact, microbial 
identification may become dependent on characterization of the 
genetic content. Tumors or other cells exhibiting abnormal 
physiology will be detectable by use of the present invention. 
Also, knowing the genetic fingerprint of a microorganism may 
provide very useful information on how to treat an infection by 
such organism. 

Modifications of the fingerprint embodiments may be 
used to diagnose the condition of the organism. For example, a 
blood sample is presently used for diagnosing any of a number 
of different physiological conditions. A multi-dimensional
fingerprinting method made available by the present invention
could become a routine means for diagnosing an enormous number
of physiological features simultaneously. This may
revolutionize the practice of medicine in providing information
on an enormous number of parameters together at one time. In
another way, knowledge of genetic predisposition may also
revolutionize the practice of medicine by providing a physician
with the ability to predict the likelihood of particular medical
conditions arising at any particular moment. It also provides
the ability to apply preventative medicine.

The present invention might also find application in
use for screening new drugs and new reagents which may be very
important in medical diagnosis or other applications. For
example, a description of generating a population of monoclonal
antibodies with defined specificities may be very useful for
producing various drugs or diagnostic reagents.

Also available are kits with the reagents useful for 
performing sequencing, fingerprinting, and mapping procedures. 
The kits will have various compartments with the desired
necessary reagents, e.g., substrate, labeling reagents for
target samples, buffers, and other useful accompanying
products.

C. Mapping

The present invention also provides the means for
mapping sequences within enormous stretches of sequence. For
example, nucleotide sequences may be mapped within enormous
chromosome size sequence maps. For example, it would be
possible to map a chromosomal location within the chromosome
which contains hundreds of millions of nucleotide base pairs.
In addition, the mapping and fingerprinting embodiments allow
for testing of chromosomal translocations, one of the standard
problems for which amniocentesis is performed.

Thus, the present invention provides a powerful tool
and the means for performing sequencing, fingerprinting, and
mapping functions on polymers. Although most easily and
directly applicable to polynucleotides, polypeptides,
carbohydrates, and other sorts of molecules can be
advantageously utilized using the present technology.

The present invention will be better understood by
reference to the following illustrative examples. The
following examples are offered by way of illustration and not
by way of limitation.

EXPERIMENTAL

I. Sequencing
    A. polynucleotide
    B. polypeptide
    C. short peptide
        1. Herz antibody identification

II. Fingerprinting
    A. polynucleotide fingerprint
    B. peptide fingerprint
    C. cell classification scheme
    D. temporal development scheme
        1. developmental antigens
        2. developmental mRNA expression
    E. diagnostic test
        1. viral identification
        2. bacterial identification
        3. other microbiological identifications
        4. allergy test (immobilized antigens)
    F. individual (animal/plant) identification
        1. genetic
        2. immunological
    G. genetic screen
        1. test alleles with markers
        2. amniocentesis

III. Mapping
    A. positionally located clones (caged biotin)
        1. short probes, long targets
        2. long targets, short probes
    B. positionally defined clones

IV. Conclusion

* * *



Relevant applications whose techniques are
incorporated herein by reference are Pirrung, et al., U.S. S.N.
07/362,901 (VLSIPS parent), filed June 7, 1989; Pirrung, et al.,
U.S. S.N. 07/492,462 (VLSIPS CIP), filed March 7, 1990; Barrett,
et al., U.S. S.N. 07/435,316 (caged biotin), filed November 13,
1989; Barrett, et al., U.S. S.N. 07/612,671 (caged biotin CIP),
filed November 13, 1990; and commonly assigned and
simultaneously filed applications U.S. S.N. __/___,___, attorney
docket number 11509-28 (automated VLSIPS) and U.S. S.N.
__/___,___, attorney docket number 11509-26 (sequencing by
synthesis).

Also, additional relevant techniques are described,
e.g., in Sambrook, J., et al. (1989) Molecular Cloning: A
Laboratory Manual, 2d Ed., vols 1-3, Cold Spring Harbor Press,
New York; Greenstein and Winitz (1961) Chemistry of the Amino
Acids, Wiley and Sons, New York; Bodanszky, M. (1988) Peptide
Chemistry: A Practical Textbook, Springer-Verlag, New York;
Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold
Spring Harbor Press, New York; Glover, D. (ed.) (1987) DNA
Cloning: A Practical Approach, vols 1-3, IRL Press, Oxford;
Bishop and Rawlings (1987) Nucleic Acid and Protein Sequence
Analysis: A Practical Approach, IRL Press, Oxford; Hames and
Higgins (1985) Nucleic Acid Hybridisation: A Practical
Approach, IRL Press, Oxford; Wu et al. (1989) Recombinant DNA
Methodology, Academic Press, San Diego; Goding (1984)
Monoclonal Antibodies: Principles and Practice, (2d ed.),
Academic Press, San Diego; Finegold and Barron (1986) Bailey
and Scott's Diagnostic Microbiology, (7th ed.), Mosby Co., St.
Louis; Collins et al. (1989) Microbiological Methods, (6th
ed.), Butterworth, London; Chaplin and Kennedy (1986)
Carbohydrate Analysis: A Practical Approach, IRL Press, Oxford;
Van Dyke (ed.) (1985) Bioluminescence and Chemiluminescence:
Instruments and Applications, vol 1, CRC Press, Boca Raton; and
Ausubel et al. (ed.) (1990) Current Protocols in Molecular
Biology, Greene Publishing and Wiley-Interscience, New York;
each of which is hereby incorporated herein by reference.

The following examples are provided to illustrate the 
efficacy of the inventions herein. All operations were 
conducted at about ambient temperatures and pressures unless 
indicated to the contrary. 

I. SEQUENCING

A. Polynucleotide 

1. HPLC of the photolysis of 5'-O-nitroveratryl-thymidine.

In order to determine the time for photolysis of
5'-O-nitroveratryl thymidine to thymidine, a 100 µM solution of
NV-Thym-OH (5'-O-nitroveratryl thymidine) in dioxane was made and
~200 µl aliquots were irradiated (in a quartz cuvette 1 cm x 2
mm) at 362.3 nm for 20 sec, 40 sec, 60 sec, 2 min, 5 min, 10
min, 15 min, and 20 min. The resulting irradiated mixtures
were then analyzed by HPLC using a Varian MicroPak SP column
(C18 analytical) at a flow rate of 1 ml/min and a solvent
system of 40% CH3CN and 60% water. Thymidine has a retention
time of 1.2 min and NVO-Thym-OH has a retention time of 2.1
min. It was seen that after 10 min of exposure the
deprotection was complete.

2. Preparation and Detection of Thymidine-Cytidine dimer (FITC)

The reaction is illustrated:

[Reaction scheme not legible in this copy; the last step shown is "3) FITC".]

To an aminopropylated glass slide (standard VLSIPS)
was added a mixture of the following:

12.2 mg of NVO-Thym-CO2H (IX)
3.4 mg of HOBT (N-hydroxybenzotriazole)
8.8 µl DIEA (diisopropylethylamine)
11.1 mg BOP reagent
2.5 ml DMF

After 2 h coupling time (standard VLSIPS) the plate
was washed, acetylated with acetic anhydride/pyridine, washed,
dried, and photolyzed in dioxane at ~362 nm at 14 mW/cm^2 for 10
min using a 500 µm checkerboard mask. The slide was then taken
and treated with a mixture of the following:

107 mg of FMOC-amine modified C (III)
21 mg of tetrazole
1 ml anhydrous CH3CN

After being treated for approximately 8 min, the
slide was washed off with CH3CN, dried, and oxidized with
I2/H2O/THF/lutidine for 1 min. The slide was again washed,
dried, and treated for 30 min with a 20% solution of DBU in
DMF. After thorough rinsing of the slide, it was next exposed
to a FITC solution (1 mM fluorescein isothiocyanate [FITC] in
DMF) for 50 min, then washed, dried, and examined by
fluorescence microscopy. This reaction is illustrated:

3. Preparation and Detection of Thymidine-Cytidine dimer (Biotin)

An aminopropyl glass slide was soaked in a solution
of ethylene oxide (20% in DMF) to generate a hydroxylated
surface. To the slide was added a mixture of the following:

32 mg of NVO-T-OCED (X)
11 mg of tetrazole
0.5 ml of anhydrous CH3CN

After 8 min the plate was then rinsed with
acetonitrile, then oxidized with I2/H2O/THF/lutidine for 1 min,
washed and dried. The slide was then exposed to a 1:3 mixture
of acetic anhydride:pyridine for 1 h, then washed and dried.
The substrate was then photolyzed in dioxane at 362 nm at 14
mW/cm^2 for 10 min using a 500 µm checkerboard mask, dried, and
then treated with a mixture of the following:

65 mg of biotin modified C (IV)
11 mg of tetrazole
0.5 ml anhydrous CH3CN

After 8 min the slide was washed with CH3CN then
oxidized with I2/H2O/THF/lutidine for 1 min, washed, and then
dried. The slide was then soaked for 30 min in a PBS/0.05%
Tween 20 buffer and the solution then shaken off. The slide
was next treated with FITC-labeled streptavidin at 10 µg/ml in
the same buffer system for 30 min. After this time the
streptavidin-buffer system was rinsed off with fresh PBS/0.05%
Tween 20 buffer and then the slide was finally agitated in
distilled water for about 1/2 h. After drying, the slide was
examined by fluorescence microscopy (see Fig. 2 and Fig. 3).
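
As an illustrative aside, the light dose delivered during the photolysis steps above follows from the stated irradiance and exposure time. The helper below is a hypothetical sketch, not part of the disclosed procedure; it simply multiplies irradiance by time.

```python
def exposure_dose_j_per_cm2(irradiance_mw_per_cm2: float, minutes: float) -> float:
    """Energy delivered per unit area (J/cm^2): irradiance multiplied by time."""
    seconds = minutes * 60.0
    watts_per_cm2 = irradiance_mw_per_cm2 * 1e-3  # mW -> W
    return watts_per_cm2 * seconds


if __name__ == "__main__":
    # 14 mW/cm^2 for 10 min, as in the photolysis steps described above.
    print(exposure_dose_j_per_cm2(14, 10))
```

For the conditions recited above (14 mW/cm^2 for 10 min), this comes to about 8.4 J/cm^2.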

4. substrate preparation
Before attachment of reactive groups it is preferred
to clean the substrate which is, in a preferred embodiment, a
glass substrate such as a microscope slide or cover slip. A
roughened surface will be useable but a plastic or other solid
substrate is also appropriate. According to one embodiment the
slide is soaked in an alkaline bath consisting of, e.g.,
1 liter of 95% ethanol with 120 ml of water and 120 grams of
sodium hydroxide for 12 hours. The slides are washed with a
buffer and under running water, allowed to air dry, and rinsed
with a solution of 95% ethanol.

The slides are then aminated with, e.g.,
aminopropyltriethoxysilane for the purpose of attaching amino
groups to the glass surface on linker molecules, although other
omega functionalized silanes could also be used for this
purpose. In one embodiment 0.1% aminopropyltriethoxysilane is
utilized, although solutions with concentrations from 10^-7% to
10% may be used, with about 10^-3% to 2% preferred. A 0.1%
mixture is prepared by adding to 100 ml of a 95%
ethanol/5% water mixture, 100 microliters (µl) of
aminopropyltriethoxysilane. The mixture is agitated at about
ambient temperature on a rotary shaker for an appropriate
amount of time, e.g., about 5 minutes. 500 µl of this mixture
is then applied to the surface of one side of each cleaned
slide. After 4 minutes or more, the slides are decanted of
this solution and thoroughly rinsed three times or more by
dipping in 100% ethanol.
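
For illustration, the volume of silane corresponding to a given percentage can be computed as sketched below. The function name is hypothetical, and the sketch assumes the percentages are volume/volume and neglects the small volume added by the silane itself; neither assumption is stated in the text.

```python
def silane_volume_ul(percent_v_v: float, solvent_ml: float) -> float:
    """Microlitres of silane for an approximately percent_v_v % (v/v) mixture.

    ASSUMPTION: percentages are volume/volume and the added silane volume
    is negligible compared with the solvent volume.
    """
    return percent_v_v / 100.0 * solvent_ml * 1000.0


if __name__ == "__main__":
    for pct in (1e-3, 0.1, 2.0):
        print(pct, "% ->", silane_volume_ul(pct, 100.0), "microlitres per 100 ml")
```

With 100 ml of the ethanol/water mixture, 0.1% corresponds to the 100 µl recited above, and the preferred range of about 10^-3% to 2% corresponds to roughly 1 µl to 2 ml.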

After the slides dry, they are heated in a 110-120 °C
vacuum oven for about 20 minutes, and then allowed to cure at
room temperature for about 12 hours in an argon environment.
The slides are then dipped into DMF (dimethylformamide)
solution, followed by a thorough washing with methylene
chloride.

5. linker attachment, blocking of free sites
The aminated surface of the slide is then exposed to
about 500 µl of, for example, a 30 millimolar (mM) solution of
NVOC-nucleotide-NHS (N-hydroxysuccinimide) in DMF for
attachment of a NVOC-nucleotide to each of the amino groups.
See, e.g., SIGMA Chemical Company for various nucleotide
derivatives. The surface is washed with, for example, DMF,
methylene chloride, and ethanol.

Any unreacted aminopropyl silane on the surface, 
i.e., those amino groups which have not had the NVOC-nucleotide 
attached, are now capped with acetyl groups (to prevent further 
reaction) by exposure to a 1:3 mixture of acetic anhydride in 
pyridine for 1 hour. Other materials which may perform this
residual capping function include trifluoroacetic anhydride,
formicacetic anhydride, or other reactive acylating agents.
Finally, the slides are washed again with DMF, methylene
chloride, and ethanol.

6. synthesis of eight trimers of C and T
Fig. 4 illustrates a possible synthesis of the eight
trimers of the two-monomer set: cytosine and thymine
(represented by C and T, respectively). A glass slide bearing
silane groups terminating in 6-nitroveratryloxycarboxamide
(NVOC-NH) residues is prepared as a substrate. Active esters
(pentafluorophenyl, OBt, etc.) of cytosine and thymine
protected at the 5' hydroxyl group with NVOC are prepared as
reagents. While not pertinent to this example, if side chain
protecting groups are required for the monomer set, these must
not be photoreactive at the wavelength of light used to
deprotect the primary chain.

For a monomer set of size n, n x ℓ cycles
are required to synthesize all possible sequences of length ℓ.
A cycle consists of:

1. Irradiation through an appropriate mask
to expose the 5'-OH groups at the sites where
the next residue is to be added, with appropriate
washes to remove the by-products of the
deprotection.

2. Addition of a single activated and protected 
(with the same photochemically-removable group) 
monomer, which will react only at the sites 
addressed in step 1, with appropriate washes to 
remove the excess reagent from the surface. 
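
The two-step cycle above can be sketched in code for illustration. The following is a simplified model under stated assumptions, not the disclosed apparatus: the function name `synthesize` is hypothetical, each site is represented only by its growing string, coupling is assumed to be complete at every irradiated site, and washes and chain direction are not modelled.

```python
from itertools import product


def synthesize(monomers, length):
    """Simulate light-directed synthesis of every sequence of a given length.

    Returns (sites, cycles): the chain grown at each site and the number of
    irradiate-then-couple cycles used.
    """
    targets = ["".join(p) for p in product(monomers, repeat=length)]
    sites = [""] * len(targets)      # growing chain at each site
    cycles = 0
    for layer in range(length):      # one "layer" per residue position
        for m in monomers:           # one mask per monomer per layer
            # Step 1: the mask is transparent only over sites that need m next.
            mask = [t[layer] == m for t in targets]
            # Step 2: the protected monomer couples only at irradiated sites.
            sites = [s + m if lit else s for s, lit in zip(sites, mask)]
            cycles += 1
    return sites, cycles


if __name__ == "__main__":
    sites, cycles = synthesize("CT", 3)
    print(cycles, sites)
```

For the two-monomer set C and T and a length of three, the sketch uses 2 x 3 = 6 cycles and yields all eight trimers, one per site.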

The above cycle is repeated for each member of the 
monomer set until each location on the surface has been 
extended by one residue in one embodiment. In other 
embodiments, several residues are sequentially added at one 
location before moving on to the next location. Cycle times 
will generally be limited by the coupling reaction rate, now as 
short as about 10 min in automated oligonucleotide 
synthesizers. This step is optionally followed by addition of

Of course, greater diversity is obtained by using
masking strategies which will also include the synthesis of
polymers having a length of less than ℓ. If, in the extreme
case, all polymers having a length less than or equal to ℓ are
synthesized, the number of polymers synthesized will be:

n^ℓ + n^(ℓ-1) + ... + n^1.     (3)
The maximum number of lithographic steps needed will
generally be n for each "layer" of monomers, i.e., the total
number of masks (and, therefore, the number of lithographic
steps) needed will be n x ℓ. The size of the transparent mask
regions will vary in accordance with the area of the substrate
available for synthesis and the number of sequences to be
formed. In general, the size of the synthesis areas will be:

size of synthesis areas = (A)/(S)

where:

A is the total area available for synthesis; and
S is the number of sequences desired in the area.
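
The counting relationships above may be illustrated numerically. The sketch below uses hypothetical function names; the 1 cm^2 area used in the example run is an assumption chosen only for illustration.

```python
def polymers_up_to_length(n: int, length: int) -> int:
    """n^length + n^(length-1) + ... + n^1, as in expression (3)."""
    return sum(n ** k for k in range(1, length + 1))


def masks_needed(n: int, length: int) -> int:
    """Maximum number of masks (lithographic steps): n per layer."""
    return n * length


def synthesis_area(total_area: float, sequences: int) -> float:
    """Size of each synthesis area, A / S, in the units of total_area."""
    return total_area / sequences


if __name__ == "__main__":
    print(polymers_up_to_length(2, 3), masks_needed(2, 3))
    # ASSUMPTION: all 4^10 decamers on 1 cm^2 (1e8 square micrometres).
    print(masks_needed(4, 10), synthesis_area(1e8, 4 ** 10))
```

For two monomers and length three, expression (3) gives 8 + 4 + 2 = 14 polymers using at most 6 masks. For four nucleotides and length ten, 40 masks suffice, and under the assumed 1 cm^2 area each of the 4^10 synthesis areas would be roughly 95 square micrometres.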

It will be appreciated by those of skill in the art
that the above method could readily be used to simultaneously
produce thousands or millions of oligomers on a substrate using
the photolithographic techniques disclosed herein.
Consequently, the method results in the ability to practically
test large numbers of, for example, di, tri, tetra, penta,
hexa, hepta, octa, nona, deca, even dodecanucleotides,
or larger polynucleotides (or correspondingly, polypeptides).

The above example has illustrated the method by way
of a manual example. It will of course be appreciated that
automated or semi-automated methods could be used. The
substrate would be mounted in a flow cell for automated
addition and removal of reagents, to minimize the volume of
reagents needed, and to more carefully control reaction
conditions. Successive masks will be applicable manually or
automatically. See, e.g., U.S. S.N. 07/492,462 (VLSIPS CIP) and
U.S. S.N. __/___,___, attorney docket number 11509-28 (automated
VLSIPS).

7. labeling of target
The target oligonucleotide can be labeled using
standard procedures referred to above. As discussed, for
certain situations, a reagent which recognizes interaction,
e.g., ethidium bromide, may be provided in the detection step.
Alternatively, fluorescence labeling techniques may be applied,
see, e.g., Smith, et al. (1986) Nature, 321:674-679; and
Prober, et al. (1987) Science, 238:336-341. The techniques
described therein will be followed with minimal modifications
as appropriate for the label selected.

8. dimers of A, C, G, and T 

The described technique may be applied, with 
photosensitive blocked nucleotides corresponding to adenine, 
cytosine, guanine, and thymine, to make combinations of 
polynucleotides consisting of each of the four different 
nucleotides. All 16 possible dimers would be made using a 
minor modification of the described method. 

9. 10-mers of A, C, G, and T 

The described technique for making dimers of A, C, G,
and T may be further extended to make longer oligonucleotides.
The automated system described, e.g., in U.S. S.N. 07/492,462
(VLSIPS CIP), and U.S. S.N. __/___,___, attorney docket number
11509-28 (automated VLSIPS), can be adapted to make all
possible 10-mers composed of the 4 nucleotides A, C, G, and T.
The photosensitive, blocked nucleotide analogues have been
described above, and would be readily adaptable to longer
oligonucleotides.

10. specific recognition hybridization to 10-mers

The described hybridization conditions are directly
applicable to the sequence specific recognition reagents
attached to the substrate, produced as described immediately
above. The 10-mers have an inherent property of hybridizing to
a complementary sequence. For optimum discrimination between
full matching and some mismatch, the conditions of
hybridization should be carefully selected, as described above.
Careful control of the conditions, and titration of parameters
should be performed to determine the optimum collective
conditions.

11. hybridization
Hybridization conditions are described in detail,
e.g., in Hames and Higgins (1985) Nucleic Acid Hybridisation:
A Practical Approach; and the considerations for selecting
particular conditions are described, e.g., in Wetmur and
Davidson, (1968) J. Mol. Biol. 31:349-370, and Wood et al.
(1985) Proc. Natl. Acad. Sci. USA 82:1585-1588. As described
above, conditions are desired which can distinguish matching
along the entire length of the probe from where there is one or
more mismatched bases. The length of incubation and conditions
will be similar, in many respects, to the hybridization
conditions used in Southern blot transfers. Typically, the GC
bias may be minimized by the introduction of appropriate
concentrations of the alkylammonium buffers, as described
above.

Titration of the temperature and other parameters is 
desired to determine the optimum conditions for specificity and 
distinguishability of absolutely matched hybridization from 
mismatched hybridization. 

A fluorescently labeled target or set of targets is
generated, as described in Prober, et al. (1987) Science
238:336-341, or Smith, et al. (1986) Nature 321:674-679.
Preferably, the target or targets are of the same length as, or
slightly longer than, the oligonucleotide probes attached to
the substrate and they will have known sequences. Thus, only a
few of the probes hybridize perfectly with the target, and
the identity of those particular probes would be known.
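
To illustrate why only a few probes are expected to match a known target perfectly, the following sketch lists the probes that are fully complementary to some stretch of a target. It is a simplified model with hypothetical names: probes and targets are written 5' to 3', only perfect Watson-Crick complementarity is considered, and the example target sequence is invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def perfect_match_probes(target: str, k: int) -> set:
    """Probes of length k that are fully complementary to a stretch of target."""
    return {
        reverse_complement(target[i:i + k])
        for i in range(len(target) - k + 1)
    }


if __name__ == "__main__":
    # ASSUMPTION: an invented 12-nucleotide target probed with 10-mers.
    print(sorted(perfect_match_probes("ACGTACGTTAGC", 10)))
```

A target two nucleotides longer than the probes contains three probe-length stretches, so at most three of the roughly one million possible 10-mer probes would be expected to hybridize perfectly.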

The substrate and probes are incubated under
appropriate conditions for a sufficient period of time to allow
hybridization to completion. The time is measured to determine
when the probe-target hybridizations have reached completion.
A salt buffer which minimizes GC bias is preferred,
incorporating, e.g., a buffer such as tetramethyl ammonium or
tetraethyl ammonium ion at between about 2.4 and 3.0 M. See
Wood, et al. (1985) Proc. Nat'l Acad. Sci. USA 82:1585-1588.
The incubation time is typically at least about 30 min, and may
be as long as about 1-5 days. Typically very long matches will
hybridize more quickly, very short matches will hybridize less
quickly, depending upon relative target and probe
concentrations. The hybridization will be performed under
conditions where the reagents are stable for that time
duration.

Upon maximal hybridization, the conditions for 
washing are titrated. Three parameters initially titrated are 
time, temperature, and cation concentration of the wash step. 
The matrix is scanned at various times to determine the 
conditions at which the distinguishability between true perfect
hybrid and mismatched hybrid is optimized. These conditions 
will be preferred in the sequencing embodiments. 

12. positional detection of specific interaction

As indicated above, the detection of specific
interactions may be performed by detecting the positions where
the labeled target sequences are attached. Where the label is
a fluorescent label, the apparatus described, e.g., in U.S. S.N.
07/492,462 (VLSIPS CIP); and U.S. S.N. __/___,___, attorney
docket number 11509-28, may be advantageously applied. In
particular, the synthetic processes described above will result
in a matrix pattern of specific sequences attached to the
substrate, and a known pattern of interactions can be converted
to corresponding sequences.

In an alternative embodiment, a separate reagent 
which differentially interacts with the probe and interacted 
probe/targets can indicate where interaction occurs or does not 
occur. A single-strand specific reagent will indicate where no 
interaction has taken place, while a double-strand specific 
reagent will indicate where interaction has taken place. An 
intercalating dye, e.g., ethidium bromide, may be used to
indicate the positions of specific interaction. 

13. analysis

Conversion of the positional data into sequence
specificity will provide the set of subsequences whose analysis,
by overlap segments, may be performed, as described above.
Analysis is provided by the methodology described above, or
using, e.g., software available from the Genetic Engineering
Center, P.O. Box 794, 11000 Belgrade, Yugoslavia (Yugoslav
group). See, also, Macevicz, PCT publication no. WO 90/04652,
which is hereby incorporated herein by reference.
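
By way of a hedged illustration of analysis by overlap segments, the sketch below chains subsequences that overlap by all but one base. It is not the software referred to above, and it is not presented as the method of any cited reference; the function name is hypothetical, and the sketch assumes subsequences of equal length (at least two bases) with every overlap unique, so that the chain is unambiguous.

```python
def assemble(kmers):
    """Chain k-mers that overlap by k-1 bases into a single sequence.

    ASSUMPTION: all k-mers have the same length k >= 2 and every (k-1)-base
    overlap is unique, so there is exactly one way to chain them.
    """
    kmers = set(kmers)
    k = len(next(iter(kmers)))
    suffixes = {s[1:] for s in kmers}
    # The first k-mer is the one whose prefix is not the suffix of any k-mer.
    starts = [s for s in kmers if s[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting subsequence")
    by_prefix = {}
    for s in kmers:
        if s[:-1] in by_prefix:
            raise ValueError("ambiguous overlap")
        by_prefix[s[:-1]] = s
    seq = starts[0]
    used = 1
    while used < len(kmers) and seq[-(k - 1):] in by_prefix:
        seq += by_prefix[seq[-(k - 1):]][-1]  # extend by the one new base
        used += 1
    return seq


if __name__ == "__main__":
    print(assemble({"ACGT", "CGTT", "GTTG", "TTGC", "TGCA"}))
```

Repeated subsequences in a real target would violate the uniqueness assumption and would require the fuller treatment referred to above.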

B. Polypeptide 

The description of the preparation of short peptides
on a substrate incorporates by reference sections in U.S. S.N.
07/492,462 (VLSIPS CIP), and is described below.

1. slide preparation 

Preparation of the substrate follows that described 
above for nucleotides. 

2. linker attachment, blocking of free sites
The aminated surface of the slide is exposed to about
500 µl of, e.g., a 30 millimolar (mM) solution of NVOC-GABA
(gamma amino butyric acid) NHS (N-hydroxysuccinimide) in DMF
for attachment of a NVOC-GABA to each of the amino groups. The
surface is washed with, for example, DMF, methylene chloride,
and ethanol. See U.S. S.N. __/___,___, attorney docket number
11509-28, for details on amino acid chemistry.

Any unreacted aminopropyl silane on the surface, 
i.e., those amino groups which have not had the NVOC-GABA 
attached, are now capped with acetyl groups (to prevent further 
reaction) by exposure to a 1:3 mixture of acetic anhydride in 
pyridine for 1 hour. Other materials which may perform this 
residual capping function include trifluoroacetic anhydride,
formicacetic anhydride, or other reactive acylating agents. 
Finally, the slides are washed again with DMF, methylene 
chloride, and ethanol. 

3. synthesis of 8 trimers of "A" and "B" 

See U.S. S.N. 07/492,462 (VLSIPS CIP) which describes
the preparation of glycine and phenylalanine trimers. The
technique is similar to the method described above for making
trimers of C and T, but substituting photosensitive blocked
glycine for the C derivative and photosensitive blocked
phenylalanine for the T derivative.

4. synthesis of a dimer of an aminopropyl 
group and a fluorescent group 

In synthesizing the dimer of an aminopropyl group and
a fluorescent group, a functionalized Durapore membrane was
used as a substrate. The Durapore membrane was a
polyvinylidene difluoride with aminopropyl groups. The
aminopropyl groups were protected with the DDZ group by
reaction of the carbonyl chloride with the amino groups, a
reaction readily known to those of skill in the art. The surface
bearing these groups was placed in a solution of THF and
contacted with a mask bearing a checkerboard pattern of 1 mm
opaque and transparent regions. The mask was exposed to
ultraviolet light having a wavelength down to at least about
280 nm for about 5 minutes at ambient temperature, although a
wide range of exposure times and temperatures may be
appropriate in various embodiments of the invention.
For example, in one embodiment, an exposure time of between
about 1 and 5000 seconds may be used at process temperatures of
between -70 and +50 °C.

In one preferred embodiment, exposure times of 
between about 1 and 500 seconds at about ambient pressure are 
used. In some preferred embodiments, pressure above ambient is 
used to prevent evaporation. 

The surface of the membrane was then washed for about 
1 hour with a fluorescent label which included an active ester 
bound to a chelate of a lanthanide. Wash times will vary over 
a wide range of values from about a few minutes to a few hours. 
These materials fluoresce in the red and the green visible 
region. After the reaction with the active ester in the 
fluorophore was complete, the locations in which the
fluorophore was bound could be visualized by exposing them to
ultraviolet light and observing the red and the green
fluorescence. It was observed that the derivatized regions of
the substrate closely corresponded to the original pattern
of the mask.

5. demonstration of signal capability
Signal detection capability was demonstrated using a
low-level standard fluorescent bead kit manufactured by Flow
Cytometry Standards and having model no. 824. This kit
includes 5.8 µm diameter beads, each impregnated with a known
number of fluorescein molecules.

One of the beads was placed in the illumination field
on the scan stage in a field of a laser spot which was
initially shuttered. After being positioned in the
illumination field, the photon detection equipment was turned
on. The laser beam was unblocked and it interacted with the
particle bead, which then fluoresced. Fluorescence curves of
beads impregnated with 7,000 and 29,000 fluorescein molecules
are shown in Figs. 11A and 11B, respectively, of U.S. S.N.
07/492,462 (VLSIPS CIP). On each curve, traces for beads
without fluorescein molecules are also shown. These
experiments were performed with 488 nm excitation, with 100 µW
of laser power. The light was focused through a 40 power 0.75
NA objective.

The fluorescence intensity in all cases started off
at a high value and then decreased exponentially. The
fall-off in intensity is due to photobleaching of the fluorescein
molecules. The traces of beads without fluorescein molecules
are used for background subtraction. The difference in the
initial exponential decay between labeled and nonlabeled beads
is integrated to give the total number of photon counts, and
this number is related to the number of molecules per bead.
Therefore, it is possible to deduce the number of photons per
fluorescein molecule that can be detected. This calculation
indicates that about 40 to 50 photons per fluorescein molecule
are detected.
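
The background-subtracted integration described above can be sketched as follows. The function name is hypothetical and the count values in the example are invented solely to show the arithmetic; they are not measurements from the experiments described herein.

```python
def photons_per_molecule(labeled_counts, blank_counts, molecules_per_bead):
    """Integrate background-subtracted counts and divide by molecule number."""
    if len(labeled_counts) != len(blank_counts):
        raise ValueError("traces must have the same number of time bins")
    net = sum(l - b for l, b in zip(labeled_counts, blank_counts))
    return net / molecules_per_bead


if __name__ == "__main__":
    # ASSUMPTION: invented, exponentially decaying counts per time bin.
    labeled = [120_000, 90_000, 60_000, 40_000, 20_000]
    blank = [1_000] * 5
    print(photons_per_molecule(labeled, blank, 7_000))
```

With these invented traces and a bead carrying 7,000 fluorescein molecules, the sketch returns a value within the 40 to 50 range recited above; real traces would of course be required to reproduce the reported figure.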

6. determination of the number of molecules per unit area

Aminopropylated glass microscope slides prepared
according to the methods discussed above were utilized in order
to establish the density of labeling of the slides. The free
amino termini of the slides were reacted with FITC (fluorescein
isothiocyanate) which forms a covalent linkage with the amino
group. The slide is then scanned to count the number of
fluorescent photons generated in a region which, using the
estimated 40-50 photons per fluorescent molecule, enables the
calculation of the number of molecules which are on the surface
per unit area.

A slide with aminopropyl silane on its surface was
immersed in a 1 mM solution of FITC in DMF for 1 hour at about
ambient temperature. After reaction, the slide was washed
twice with DMF and then washed with ethanol, water, and then
ethanol again. It was then dried and stored in the dark until
it was ready to be examined.

Through the use of curves similar to those shown in
Fig. 11 of U.S. S.N. 07/492,462 (VLSIPS CIP), and by integrating
the fluorescent counts under the exponentially decaying signal,
the number of free amino groups on the surface after
derivatization was determined. It was determined that slides
with labeling densities of 1 fluorescein per 10^3 x 10^3 nm to
about 2 x 2 nm could be reproducibly made as the concentration
of aminopropyltriethoxysilane varied from 10^-5 % to about
10^-1 %.
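
As a worked illustration of the arithmetic in this example, the
sketch below converts a labeling density quoted as "one molecule per
d x d nm" into molecules per square centimeter, and converts an
integrated photon count into a surface density using the calibration
of 40 to 50 detected photons per molecule. The function names and the
sample count and scan area are hypothetical; only the two spacings
and the calibration range come from the text.

```python
NM_PER_CM = 1e7  # nanometers per centimeter


def density_per_cm2(spacing_nm):
    """Molecules per cm^2 if each molecule occupies a spacing_nm x spacing_nm square."""
    if spacing_nm <= 0:
        raise ValueError("spacing must be positive")
    side_cm = spacing_nm / NM_PER_CM
    return 1.0 / (side_cm * side_cm)


def molecules_per_cm2(net_counts, scanned_area_cm2, photons_per_molecule):
    """Surface density from background-subtracted, integrated photon counts."""
    return net_counts / photons_per_molecule / scanned_area_cm2


sparse = density_per_cm2(1000.0)  # 1 fluorescein per 10^3 x 10^3 nm
dense = density_per_cm2(2.0)      # 1 fluorescein per 2 x 2 nm
print(f"sparse: {sparse:.1e} per cm^2, dense: {dense:.1e} per cm^2")

# Hypothetical scan: 4.5e6 net counts from a 1e-4 cm^2 region (assumed values).
# The 40-50 photons-per-molecule calibration brackets the estimate.
low = molecules_per_cm2(4.5e6, 1e-4, 50.0)
high = molecules_per_cm2(4.5e6, 1e-4, 40.0)
print(f"estimated density between {low:.2e} and {high:.2e} per cm^2")
```

The two quoted spacings therefore correspond to a range of roughly
five orders of magnitude in surface density.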

7. removal of NVOC and attachment of a
fluorescent marker

NVOC-GABA groups were attached as described above. 
The entire surface of one slide was exposed to light so as to 
expose a free amino group at the end of the gamma amino butyric 
acid. This slide, and a duplicate which was not exposed, were 
then exposed to fluorescein isothiocyanate (FITC).

Fig. 12A of U.S. S.N. 07/492,462 (VLSIPS CIP)
illustrates the slide which was not exposed to light, but which
was exposed to FITC. The units of the x axis are time and the
units of the y axis are counts. The trace contains a certain
amount of background fluorescence. The duplicate slide was
exposed to 350 nm broadband illumination for about 1 minute
(12 mW/cm^2, ~350 nm illumination), washed and reacted
with FITC. A large increase in the level of fluorescence is
observed, which indicates photolysis has exposed a number of
amino groups on the surface of the slides for attachment of a
fluorescent marker.

8. use of a mask in removal of NVOC

The next experiment was performed with a 0.1% 
aminopropylated slide. Light from a Hg-Xe arc lamp was imaged 
onto the substrate through a laser-ablated chrome-on-glass mask 
in direct contact with the substrate. 

This slide was illuminated for approximately 5 
minutes, with 12 mW of 350 nm broadband light and then reacted 
with the 1 mM FITC solution. It was put on the laser detection 
scanning stage and a graph was plotted as a two-dimensional 
representation of position color-coded for fluorescence 
intensity. The experiment was repeated a number of times 
through various masks. The fluorescence patterns for a 100x100
µm mask, a 50 µm mask, a 20 µm mask, and a 10 µm mask indicate
that the mask pattern is distinct down to at least about 10 µm
squares using this lithographic technique.

9. attachment of YGGFL and subsequent exposure
to Herz antibody and goat anti-mouse
antibody

In order to establish that receptors to a particular
polypeptide sequence would bind to a surface-bound peptide and
be detected, Leu enkephalin was coupled to the surface and
recognized by an antibody. A slide was derivatized with 0.1%
aminopropyltriethoxysilane and protected with NVOC. A 500 µm
checkerboard mask was used to expose the slide in a flow cell
using backside contact printing. The Leu enkephalin sequence
(H2N-tyrosine, glycine, glycine, phenylalanine, leucine-COOH,
otherwise referred to herein as YGGFL) was attached via its
carboxy end to the exposed amino groups on the surface of the
slide. The peptide was added in DMF solution with the
BOP/HOBT/DIEA coupling reagents and recirculated through the
flow cell for 2 hours at room temperature.

A first antibody, known as the Herz antibody, was
applied to the surface of the slide for 45 minutes at 2 µg/ml
in a supercocktail (containing 1% BSA and 1% ovalbumin also
in this case). A second antibody, goat anti-mouse fluorescein
conjugate, was then added at 2 µg/ml in the supercocktail
buffer, and allowed to incubate for 2 hours.

The results of this experiment were plotted as
fluorescence intensity as a function of position. This image
was taken at 10 µm steps and showed that not only can
deprotection be carried out in a well defined pattern, but also
that (1) the method provided for successful coupling of
peptides to the surface of the substrate, (2) the surface of a
bound peptide was available for binding with an antibody, and
(3) the capabilities of the detection apparatus were sufficient
to detect binding of a receptor. Moreover, the Herz antibody
is a sequence specific reagent which may be used advantageously
as a sequence specific recognition reagent. It may be used, if
specificity is high, for sequencing purposes, and, at least,
for fingerprinting and mapping uses.

10. monomer-by-monomer formation of YGGFL and
subsequent exposure to labeled antibody

Monomer-by-monomer synthesis of YGGFL and GGFL in
alternate squares was performed on a slide in a checkerboard
pattern and the resulting slide was exposed to the Herz
antibody.

A slide was derivatized with the aminopropyl group,
protected in this case with t-BOC (t-butoxycarbonyl). The
slide was treated with TFA to remove the t-BOC protecting
group. ε-Aminocaproic acid, which was t-BOC protected at its
amino group, was then coupled onto the aminopropyl groups.
The aminocaproic acid serves as a spacer between the
aminopropyl group and the peptide to be synthesized. The amino
end of the spacer was deprotected and coupled to NVOC-leucine.
The entire slide was then illuminated with 12 mW of 325 nm
broadband illumination. The slide was then coupled with
NVOC-phenylalanine and washed. The entire slide was again
illuminated, then coupled to NVOC-glycine and washed. The
slide was again illuminated and coupled to NVOC-glycine to form
the sequence shown in the last portion of Fig. 13A of U.S. S.N.
07/492,462 (VLSIPS CIP).

Alternating regions of the slide were then
illuminated using a projection print using a 500x500 µm
checkerboard mask; thus, the amino group of glycine was exposed
only in the lighted areas. When the next coupling chemistry
step was carried out, NVOC-tyrosine was added, and it coupled
only at those spots which had received illumination. The
entire slide was then illuminated to remove all the NVOC
groups, leaving a checkerboard of YGGFL in the lighted areas
and in the other areas, GGFL. The Herz antibody (which
recognizes the YGGFL, but not GGFL) was then added, followed by
goat anti-mouse fluorescein conjugate.
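
The sequence of exposures and couplings in this example can be
followed with a small simulation. This is a simplified sketch, not
the laboratory procedure: the grid size and function names are
hypothetical, and the protecting-group chemistry is reduced to the
rule that a residue couples only at sites that were illuminated
immediately before the coupling step. Residues are added from the
C-terminal leucine toward the N-terminus, as in the text.

```python
def full_exposure(rows, cols):
    """Mask that illuminates every site on the slide."""
    return [[True] * cols for _ in range(rows)]


def checkerboard(rows, cols):
    """Mask that illuminates alternate squares (True = lighted)."""
    return [[(r + c) % 2 == 0 for c in range(cols)] for r in range(rows)]


def couple(grid, mask, residue):
    """Add residue at the N-terminus of every site deprotected through mask."""
    return [
        [residue + seq if lit else seq for seq, lit in zip(grid_row, mask_row)]
        for grid_row, mask_row in zip(grid, mask)
    ]


rows, cols = 4, 4  # hypothetical grid, one entry per checkerboard square
grid = [[""] * cols for _ in range(rows)]

# Whole-slide illumination before each of L, F, G, G builds GGFL everywhere.
for residue in "LFGG":
    grid = couple(grid, full_exposure(rows, cols), residue)

# Checkerboard illumination, then tyrosine couples only in lighted squares.
grid = couple(grid, checkerboard(rows, cols), "Y")

for row in grid:
    print(" ".join(f"{seq:<5}" for seq in row))
```

The printed grid alternates YGGFL and GGFL in the same pattern as
the mask, which is the pattern the antibody scan is expected to
reveal.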

The resulting fluorescence scan showed dark areas
containing the tetrapeptide GGFL, which is not recognized by
the Herz antibody (and thus there is no binding of the goat
anti-mouse antibody with fluorescein conjugate), and red areas
in which YGGFL was present. The YGGFL pentapeptide is
recognized by the Herz antibody and, therefore, there is
antibody in the lighted regions for the fluorescein-conjugated
goat anti-mouse to recognize.

Similar patterns for a 50 µm mask used in direct
contact ("proximity print") with the substrate provided a
pattern which was more distinct, and the corners of the
checkerboard pattern were touching as a result of the mask
being placed in direct contact with the substrate (which
reflects the increase in resolution using this technique).

11. monomer-by-monomer synthesis of YGGFL and 
PGGFL 

A synthesis using a 50 µm checkerboard mask was
conducted. However, P was added to the GGFL sites on the
substrate through an additional coupling step. P was added by
exposing protected GGFL to light through a mask, and
subsequent exposure to P in the manner set forth above.
Therefore, half of the regions on the substrate contained YGGFL
and the remaining half contained PGGFL.

The fluorescence plot for this experiment showed that
the regions in which binding did and did not occur are again
readily discernible. This experiment demonstrated
that antibodies are able to recognize a specific sequence and
that the recognition is not length-dependent.

12. monomer-by-monomer synthesis
of YGGFL and YPGGFL

In order to further demonstrate the operability of
the invention, a 50 µm checkerboard pattern of alternating
YGGFL and YPGGFL was synthesized on a substrate using
techniques like those set forth above. The resulting
fluorescence plot showed that the antibody was clearly able
to recognize the YGGFL sequence and did not bind significantly
at the YPGGFL regions.

13. synthesis of an array of sixteen different
amino acid sequences and estimation of
relative binding affinity to Herz antibody

Using techniques similar to those set forth above, an
array of 16 different amino acid sequences (replicated four
times) was synthesized on each of two glass substrates. The
sequences were synthesized by attaching the sequence NVOC-GFL
across the entire surface of the slides. Using a series of
masks, two layers of amino acids were then selectively applied
to the substrate. Each region had dimensions of 0.25 cm x
0.0625 cm. The first slide contained amino acid sequences
containing only L-amino acids while the second slide contained
selected D-amino acids. Various regions on the first and
second slides were duplicated four times on each slide. The
slides were then exposed to the Herz antibody and
fluorescein-labeled goat anti-mouse antibodies.

A fluorescence plot of the first slide, which
contained only L-amino acids, showed red areas (indicating
strong binding, i.e., 149,000 counts or more) and black areas
(indicating little or no binding of the Herz antibody, i.e.,
20,000 counts or less). The sequence YGGFL was clearly most
strongly recognized. The sequences YAGFL and YSGFL also
exhibited strong recognition by the antibody. By contrast,
most of the remaining sequences showed little or no binding.
The four duplicate portions of the slide were extremely
consistent in the amount of binding shown therein.
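
The color coding described for this plot amounts to a threshold
rule on photon counts, which can be sketched as below. The two
thresholds (149,000 and 20,000 counts) are the ones quoted in the
text; the function name, the "intermediate" label, and the sample
counts are hypothetical and are not measured values.

```python
STRONG_MIN = 149_000  # counts at or above this were shown in red
WEAK_MAX = 20_000     # counts at or below this were shown in black


def classify_region(counts):
    """Map a region's fluorescence counts to the plot's binding category."""
    if counts >= STRONG_MIN:
        return "strong (red)"
    if counts <= WEAK_MAX:
        return "little or none (black)"
    return "intermediate"


# Hypothetical counts for three regions (illustrative values only).
example_counts = {"region A": 160_000, "region B": 75_000, "region C": 12_000}
for name, counts in example_counts.items():
    print(name, counts, classify_region(counts))
```

Regions falling between the two thresholds would appear in the
intermediate colors of the plot.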

A fluorescence plot of the D-amino acid slide
indicated that strongest binding was exhibited by the YGGFL
sequence. Significant binding was also detected to YaGFL,
YsGFL, and YpGFL. The remaining sequences showed less binding
with the antibody. Low binding efficiency of the sequence
yGGFL was observed.

Table 6 lists the various sequences tested in order
of relative fluorescence, which provides information regarding
relative binding affinity.

Table 6
Apparent Binding to Herz

L-a.a. Set      D-a.a. Set
YGGFL           YGGFL
YAGFL           YaGFL
YSGFL           YsGFL
LGGFL           YpGFL
FGGFL           fGGFL
YPGFL           yGGFL
LAGFL           faGFL
FAGFL           WGGFL
WGGFL           yaGFL
                fpGFL
                waGFL

14. illustrative alternative embodiment
According to an alternative embodiment of the
invention, the methods provide for attaching to the surface a 
caged binding member which, in its caged form, has a relatively 
low affinity for other potentially binding species, such as 
receptors and specific binding substances. Such techniques are 
more fully described in copending application Serial No. 
404,920, filed September 8, 1989, and incorporated herein by 
reference for all purposes. See also U.S. S.N. 07/435,316 
(caged biotin parent) and U.S. S.N. 07/612,671 (caged biotin 
CIP), each of which is hereby incorporated herein by reference. 

According to this alternative embodiment, the 
invention provides methods for forming predefined regions on a 
surface of a solid support, wherein the predefined regions are 
capable of immobilizing receptors. The methods make use of 
caged binding members attached to the surface to enable 
selective activation of the predefined regions. The caged 
binding members are liberated to act as binding members 
ultimately capable of binding receptors upon selective 
activation of the predefined regions. The activated binding 
members are then used to immobilize specific molecules such as
receptors on the predefined region of the surface. The above
procedure is repeated at the same or different sites on the
surface so as to provide a surface prepared with a plurality of
regions on the surface containing, for example, the same or
different receptors. When receptors immobilized in this way
have a differential affinity for one or more ligands,
screenings and assays for the ligands can be conducted in the
regions of the surface containing the receptors.

The alternative embodiment may make use of novel 
caged binding members attached to the substrate. Caged 
(unactivated) members have a relatively low affinity for
receptors or substances that specifically bind to uncaged
binding members when compared with the corresponding affinities
of activated binding members. Thus, the binding members are 
protected from reaction until a suitable source of energy is 
applied to the regions of the surface desired to be activated. 
Upon application of a suitable energy source, the caging groups 
labilize, thereby presenting the activated binding member. A 
typical energy source will be light. 

Once the binding members on the surface are activated 
they may be attached to a receptor. The receptor chosen may be 
a monoclonal antibody, a nucleic acid sequence, a drug 
receptor, etc. The receptor will usually, though not always, 
be prepared so as to permit attaching it, directly or 
indirectly, to a binding member. For example, a specific 
binding substance having a strong binding affinity for the 
binding member and a strong affinity for the receptor or a 
conjugate of the receptor may be used to act as a bridge 
between binding members and receptors if desired. The method 
uses a receptor prepared such that the receptor retains its 
activity toward a particular ligand. 

Preferably, the caged binding member attached to the
solid substrate will be a photoactivatable biotin complex,
i.e., a biotin molecule that has been chemically modified with
photoactivatable protecting groups so that its binding affinity
for avidin or avidin analogs is significantly lower than that
of natural biotin. In a preferred embodiment,
the protecting groups localized in a predefined region of the
surface will be removed upon application of a suitable source
of radiation to give binding members that are biotin or a
functionally analogous compound having substantially the same
binding affinity for avidin or avidin analogs as does biotin.

In another preferred embodiment, avidin or an avidin
analog is incubated with activated binding members on the
surface until the avidin binds strongly to the binding members.
The avidin so immobilized on predefined regions of the surface
can then be incubated with a desired receptor or conjugate of a
desired receptor. The receptor will preferably be
biotinylated, e.g., a biotinylated antibody, when avidin is
immobilized on the predefined regions of the surface.
Alternatively, a preferred embodiment will present an
avidin/biotinylated receptor complex, which has been previously
prepared, to activated binding members on the surface.

II. FINGERPRINTING 

The above section on generation of reagents for
sequencing provides specific reagents useful for fingerprinting
applications. Fingerprinting embodiments may be applied
towards polynucleotide fingerprinting, polypeptide
fingerprinting, cell and tissue classification, cell and tissue
temporal development stage classification, diagnostic tests,
forensic uses for individual identification, classification of
organisms, and genetic screening of individuals. Mapping
applications are also described below.

A. Polynucleotide Fingerprint 
Polynucleotide fingerprinting may use reagents
similar to those described above for probing a sequence for the
presence of specific subsequences found therein. Typically,
the subsequences used for fingerprinting will be longer than
the sequences used in oligonucleotide sequencing. In
particular, specific long segments may be used to determine the
similarity of different samples of nucleic acids. They may
also be used to fingerprint whether specific combinations of
information are provided therein. Particular probe sequences
are selected and attached in a positional manner to a
substrate. The means for attachment may be either a
caged biotin method described, e.g., in U.S. S.N. 07/612,671
(caged biotin CIP), or another method using targeting
molecules. For example, a short polypeptide of specific
sequence may be attached to an oligonucleotide and targeted to
specific positions on a substrate having antibodies attached
thereto, the antibodies exhibiting specificity for binding to
those short peptide sequences. In another embodiment, an
unnatural nucleotide or similar complementary binding molecule
may be attached to the fingerprinting probe and the probe
thereby directed towards complementary sequences on a VLSIPS
substrate. Typically, unnatural nucleotides would be
preferred, e.g., unnatural optical isomers, which would not
interfere with natural nucleotide interactions.

Having produced a substrate with particular 
fingerprint probes attached thereto at positionally defined 
regions, the substrate may be used in a manner quite similar to 
the sequencing embodiment to provide information as to whether 
the fingerprint probes are detecting the corresponding sequence 
in a target sequence. This will often provide information 
similar to a Southern blot hybridization. 

B. Polypeptide Fingerprint 

A polypeptide fingerprint may be performed using
antibodies which recognize specific antigens on the
polypeptide. For example, monoclonal antibodies which
recognize specific sequences or antigens on a polypeptide may
be used to determine whether those epitopes are found on a
particular protein. For example, particular patterns of
epitopes would be found on various types of proteins. This
should allow identification of specific epitopes, or antigenic
determinants, which are characteristic of, e.g., beta sheet
segments, as well as of particular types of domains in various
protein types. Thus, a screening method
may be devised which can classify polypeptides, either native
or denatured, into various new classes defined by the epitopes
existing thereon.

In addition, once the substrate is generated in the
manners described above, a target peptide is exposed to the
substrate. The target may be either native or denatured,
though the conditions used to denature the polypeptide may
interfere with the specific interaction between the polypeptide
and the recognition reagent. This method does not depend on
the polypeptide being a single chain; thus, protein
complexes may also be fingerprinted using this methodology.
Structures such as multi-subunit proteins, associations of
proteins, ribosomes, nucleosomes, and other small cellular
structures may also be fingerprinted and classified according
to the presence of specific recognizable features thereon.

Peptide fingerprinting may be useful, for example, in 
correlating with particular physiological conditions or 
developmental stages of a cell or organism. Thus, a biological 
sample may be fingerprinted to determine the presence in that 
sample of a plurality of different polypeptides which are each 
individually fingerprinted. In an alternative embodiment, a 
polypeptide itself is not fingerprinted but a biological sample 
is fingerprinted searching for specific epitopes, e.g., 
polypeptide, carbohydrate, nucleic acid, or any of a number of 
other specific recognizable structural features. 

The conditions for the interactions using antibodies
are described, e.g., in Harlow and Lane (1988) Antibodies: A
Laboratory Manual, Cold Spring Harbor Press, New York. The
conditions should be titrated for temperature, buffer
composition, time, and other important parameters in an
antibody interaction.

C. Cell Classification Scheme 
The present invention can be used for cell 
classification using fingerprinting type technology as 
described above in the polypeptide fingerprint. Classes of 
cells are typically defined by the presence of common functions 
which are usually reflected by structural features. Thus, a 
plant cell is classified differently from an animal cell by a 
number of structural features. Given an unknown cell, the
present invention provides improved means for distinguishing
the different cell types. Once a cell classification scheme is
developed and the structural features which define it are
identified using the present invention, homogeneous cell
populations expressing these features may be separated from
others. Standard cell sorters may be coupled with recognition
reagents and labels which can distinguish various cell types.

a. T-Cell Classes 
T-cell classes are defined on the basis of expression 
of particular antigens characteristic of each class. For 
example, mouse T-cell differentiation markers include the LY
antigens. With the plurality of different antigens which may 
be tested using antibody or other recognition reagents, new 
populations and classes of cells may be defined. For example, 
different neural cell types may be defined on the basis of cell 
surface antigens. Different tissue types will be defined on 
the basis of tissue specific antigens. Developmental cell 
classes will be similarly defined. All of these screenings can 
make use of the VLSIPS substrates with specific recognition 
molecules attached thereto. The substrates are exposed to the 
cell types directly, assaying for attachment of cells to
specific regions, or are exposed to products of a population of 
cells, e.g., a supernatant, or a cell lysate. 

Once a cell classification scheme has been correlated
with specific structural markers therein, reagents which
recognize those features may be developed and used in a
fluorescence activated cell sorter as described, e.g., in
Dangl, J. and Herzenberg (1982) J. Immunological Methods 52:
1-14; and Becton Dickinson, Fluorescence Activated Cell Sorters
Division, San Jose, California. This will provide a
homogeneous population of cells whose function has been defined
by structure.

b. B-Cell Classes 
The present cell classification scheme may also be 
used to determine specific B-cell classes. For example, B- 
cells specific for producing IgM, IgG, IgD, IgE, and IgA may be 
defined by the internal expression of specific mRNA sequences

In one embodiment, the two may be combined in a
single incubation step. A particular incubation condition may
be found which is compatible with both hybridization
recognition and non-hybridization recognition molecules. Thus,
e.g., an incubation condition may be selected which allows both
specificity of antibody binding and specificity of nucleic acid
hybridization. This allows simultaneous performance of both
types of interactions on a single matrix. Again, where
developmental mRNA patterns are correlated with structural
features, or with probes which are able to hybridize to
intracellular mRNA populations, a cell sorter may be used to
sort specifically those cells having desired mRNA population
patterns.

E. Diagnostic Tests

The present invention also provides the ability to 
perform diagnostic tests. Diagnostic tests typically are based 
upon a fingerprint type assay, which tests for the presence of 
specific diagnostic structural features. Thus, the present 
invention provides means for viral strain identification, 
bacterial strain identification, and other diagnostic tests 
using positionally defined specific reagents. The present 
invention also allows for determining a spectrum of allergies, 
diagnosing a biological sample for any or all of the above, and
testing for many other conditions. 

1. Viral Identification 
The present invention provides reagents and 
methodology for identifying viral strains. The specific 
reagents may be either antibodies or recognition proteins which 
bind to specific viral epitopes preferably surface exposed, but 
may make use of internal epitopes, e.g., in a denatured viral 
sample. In an alternative embodiment, the viral genome may be 
probed for specific sequences which are characteristic of 
particular viral strains. As above, a combination of the two 
may be performed simultaneously in a single interaction step, 
or in separate tests, e.g., for both genetic characteristics
and epitope characteristics.

2. Bacterial Identification

Similar techniques will be applicable to identifying 
a bacterial source. This may be useful in diagnosing bacterial 
infections, or in classifying sources of particular bacterial 
species. For example, the bacterial assay may be useful in 
determining the natural range of survivability of particular 
strains of bacteria across regions of the country or in 
different ecological niches. 

3. Other Microbiological Identifications 

The present invention not only provides means for diagnosis
of other microbiological and other species, e.g., protozoal
species and parasitic species in a biological sample, but also
provides the means for assaying a combination of different
infections. For example, a biological specimen may be assayed
for the presence of any or all of these microbiological
species. In human diagnostic uses, typical samples will be
blood, sputum, stool, urine, or other samples.

4. Allergy Tests 

An immobilized set of antigens may be attached to a 
solid substrate and, instead of the standard skin reaction 
tests, a blood sample may be assayed on such a substrate to 
determine the presence of antibodies, e.g., IgE or other type
antibodies, which may be diagnostic of an allergic or
immunological susceptibility. A standard radioallergosorbent
test (RAST) may be used to check a much larger population of
antigens.

In addition, an allergy-like test may be used to
diagnose the immunological history of a particular individual.
For example, by testing the circulating antibodies in a blood
sample, which reflects the immunological history and memory of
an individual, it may be determined what infections may not
have been historically presented to the immune system. In this
manner, it may be possible to specifically supplement an immune
system for a short period of time with IgG fractions made up of
specific types of gamma globulins. Thus, hepatitis gamma
globulin injections may be better designed for a particular
environment to which a person is expected to be exposed. This
also provides the ability to identify genetically equivalent
individuals who have immunologically different experiences.
Thus, a blood sample from an individual who has a particular
combination of circulating antibodies will likely be different
from the combination of circulating antibodies found in a
genetically similar or identical individual. This could allow
for the distinction between clones of particular animals, e.g.,
mice, rats, or other animals.

F. Individual Identification
The present invention provides the ability to
fingerprint and identify a genetic individual. This individual
may be a bacterial or lower microorganism, as described above
in diagnostic tests, or a plant or animal. An individual
may be identified genetically or immunologically, as described.

1. Genetic 

Genetic fingerprinting has been utilized in comparing
different related species in Southern hybridization blots.
Genetic fingerprinting has also been used in forensic studies,
see, e.g., Morris et al. (1989) J. Forensic Science 34: 1311-
1317, and references cited therein. As described above, an
individual may be identified genetically by a sufficiently
large number of probes. The likelihood that another individual
would have an identical pattern over a sufficiently large
number of probes may be statistically negligible. However, it
is often quite important that a large number of probes be used
where the statistical probability of matching is desired to be
particularly low. In fact, the probes will optimally be
selected for having high heterogeneity among the population.
In addition, the fingerprint method may make use of the pattern
of homologies indicated by a series of more and more stringent
washes. Then, each position has both a sequence specificity
and a homology measurement, the combination of which greatly
increases the number of dimensions and correspondingly reduces
the statistical likelihood of a coincidental perfect pattern
match with another genetic individual.
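
The effect of probe number on the chance of a coincidental match
can be illustrated with a short calculation. The sketch assumes,
purely for illustration, that the probes behave independently and
that each probe has a known frequency with which two unrelated
individuals give the same result; neither assumption nor the
example frequencies come from the text, and the function name is
hypothetical.

```python
def chance_match_probability(match_frequencies):
    """Probability that two unrelated individuals agree at every probe.

    Assumes the probes are statistically independent, so the per-probe
    match frequencies simply multiply.
    """
    probability = 1.0
    for frequency in match_frequencies:
        if not 0.0 <= frequency <= 1.0:
            raise ValueError("each frequency must lie between 0 and 1")
        probability *= frequency
    return probability


# Hypothetical probes: heterogeneous probes (low match frequency) are
# far more discriminating than probes most individuals share.
few_probes = chance_match_probability([0.25] * 5)
many_probes = chance_match_probability([0.25] * 20)
common_probes = chance_match_probability([0.9] * 20)

print(f"5 heterogeneous probes:  {few_probes:.2e}")
print(f"20 heterogeneous probes: {many_probes:.2e}")
print(f"20 common probes:        {common_probes:.2e}")
```

Under these assumptions, both adding probes and choosing probes
with high heterogeneity in the population drive the chance of a
coincidental match toward zero.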

2. Immunological 
As indicated above in the diagnostic tests, it is
possible to identify a particular immune system within a
genetically homogeneous class of organisms by virtue of its
immunological history. For example, the members of a large
colony of cloned mice may be distinguishable by virtue of each
one's immunological history. For example, one mouse may have
had an immunological
response to exposure to antigen A to which her genetically
identical sibling may not have been exposed. By virtue of this
differential history, the first of the pair will likely have a
high antibody titer against the antigen A whereas her
genetically identical sibling will not have had a response to
that antigen by virtue of never having been exposed to it. For
this reason, immune systems may be identified by their
immunological memories. Thus, immunological experience may
also be a means for identifying a particular individual at a
particular moment in her lifetime.

This same immunological screening may be used for
other sorts of identifiable biological products. For example,
an individual may be identified by her combination of expressed 
proteins. These proteins may reflect a physiological state of
the individual, and would thus be useful in certain
circumstances where diagnostic tests may be performed. For
example, an individual may be identified, in part, by the
presence of particular metabolic products.

In fact, a plant origin may be determined by virtue
of having within its genome an unnatural sequence introduced to
it by genetic breeders. Thus, a marker nucleic acid sequence
may be introduced as a means to determine whether a genetic
strain of a plant or animal originated from another particular
source.

G. Genetic Screening

1. test alleles with markers 
The present invention provides for the ability to 
screen for genetic variations of individuals. For example, a 
number of genetic diseases are linked with specific alleles. 
See, e.g., Scriver, C. et al. (eds.) (1989) The Metabolic Basis
of Inherited Disease, McGraw-Hill, New York. In one
embodiment, cystic fibrosis has been correlated with a specific
gene, see, Gregory et al. (1990) Nature 347: 382-386. A number
of alleles are correlated with specific genetic deficiencies.
See, e.g., McKusick, V. (1990) Mendelian Inheritance in Man:
Catalogs of Autosomal Dominant, Autosomal Recessive, and
X-linked Phenotypes, Johns Hopkins University Press, Baltimore;
Ott, J. (1985) Analysis of Human Genetic Linkage, Johns Hopkins
University Press, Baltimore; Track, R. et al. (1989) Banbury
Report 32: DNA Technology and Forensic Science, Cold Spring
Harbor Press, New York; each of which is hereby incorporated
herein by reference.

2. Amniocentesis 
Typically, amniocentesis is used to determine whether 
chromosome translocations have occurred. The mapping procedure 
may provide the means for determining whether these 
translocations have occurred, and for detecting particular 
alleles of various markers. 

III. MAPPING 

A. Positionally Located Clones
The present invention allows for the positional 
location of specific clones useful for mapping. For example, 
caged biotin may be used for specifically positioning a probe 
to a location on a matrix pattern. 

In addition, the specific probes may be positionally 
directed to specific locations on a substrate by targeting. 
For example, polypeptide specific recognition reagents may be 
attached to oligonucleotide sequences which can be 
complementarily targeted to specific locations on a VLSIPS
substrate. Hybridization conditions, as applied for
oligonucleotide probes, will be used to target the reagents to
locations on a substrate having complementary oligonucleotides 
synthesized thereon. In another embodiment, oligonucleotide 
probes may be attached to specific polypeptide targeting 
reagents such as an antigen or antibody. These reagents can be 
directed towards a complementary antigen or antibody already 
attached to a VLSIPS substrate.

In another embodiment, an unnatural nucleotide which 
does not interfere with natural nucleotide complementary 
hybridization may be used to target oligonucleotides to 
particular positions on a substrate. Unnatural optical isomers
of natural nucleotides should be ideal candidates.

In this way, short probes may be used to determine 
the mapping of long targets or long targets may be used to map 
the position of shorter probes. See, e.g., Craig et al. (1990)
Nuc. Acids Res. 18: 2653-2660.

B. Positionally Defined Clones 

Positionally defined clones may be transferred to a 
new substrate by either physical transfer or by synthetic 
means. Synthetic means may involve either a production of the 
probe on the substrate using the VLSIPS synthetic methods, or 
may involve the attachment of a targeting sequence made by 
VLSIPS synthetic methods which will target that positionally 
defined clone to a position on a new substrate. Both methods 
will provide a substrate having a number of positionally 
defined probes useful in mapping.

IX. Conclusion

The present inventions provide greatly improved
methods and apparatus for synthesis of polymers on substrates. 
It is to be understood that the above description is intended 
to be illustrative and not restrictive. Many embodiments will 
be apparent to those of skill in the art upon reviewing the 
above description. By way of example, the invention has been 
described primarily with reference to the use of photoremovable 
protective groups, but it will be readily recognized by those 
of skill in the art that sources of radiation other than light 
could also be used. For example, in some embodiments it may be
desirable to use protective groups which are sensitive to
electron beam irradiation or x-ray irradiation, in combination
with electron beam lithography or x-ray lithography techniques.
Alternatively, the group could be removed by exposure to an 
electric current. The scope of the invention should, 
therefore, be determined not with reference to the above 
description, but should instead be determined with reference to 
the appended claims, along with the full scope of equivalents 
to which such claims are entitled. 

All publications and patent applications referred to
herein are incorporated by reference to the same extent as if
each individual publication or patent application was
specifically and individually incorporated by reference. The
present invention now being fully described, it will be
apparent to one of ordinary skill in the art that many changes
and modifications can be made thereto without departing from
the spirit or scope of the appended claims. 



WHAT IS CLAIMED IS:

1. A composition comprising a plurality of 
positionally distinguishable sequence specific reagents 
attached to a solid substrate, which reagents are capable of 
specifically binding to a predetermined subunit sequence of a 
preselected multi-subunit length having at least three
subunits, said reagents representing substantially all possible
sequences of said preselected length.

2. A composition of Claim 1, wherein said subunit 
sequence is a polynucleotide or a polypeptide. 

3. A composition of Claim 1, wherein said 
preselected multi-subunit length is five subunits and said 
subunit sequence is a polynucleotide sequence.
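
As an illustration of what "substantially all possible sequences" of a preselected length implies, the following minimal sketch enumerates every sequence of a given length. It assumes a four-letter nucleotide alphabet (A, C, G, T); the function name `all_probes` is hypothetical and is not part of the disclosure.

```python
from itertools import product

# Assumption: a four-letter DNA alphabet. A polypeptide alphabet
# would simply use a longer string of residue symbols.
NUCLEOTIDES = "ACGT"

def all_probes(length, alphabet=NUCLEOTIDES):
    """Return every sequence of the given length over the alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=length)]

pentamers = all_probes(5)
# For five subunits the full set has 4**5 members.
print(len(pentamers))
```

Under this assumption the complete set for a five-subunit polynucleotide has 4^5 = 1024 members, and the count grows by a factor of four with each added subunit.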

4. A composition of Claim 1, wherein said specific 
reagent is an oligonucleotide of at least about five 
nucleotides. 

5. A composition of Claim 1, wherein said specific 
reagent is a monoclonal antibody. 

6. A composition of Claim 1, wherein said specific 
reagents are all attached to a single solid substrate. 

7. A composition of Claim 1, wherein said reagents 
comprise about 3000 different sequences. 



8. A composition of Claim 1, wherein said reagents
represent at least about 25% of the possible subsequences of
said preselected length.

9. A composition of Claim 1, wherein said reagents
are localized in regions of the substrate having a density of
at least 25 regions per square centimeter.

10. A composition of Claim 6, wherein said substrate
has a surface area of less than about 4 square centimeters.

11. A method of analyzing a sequence of a 
polynucleotide or a polypeptide, said method comprising the 
step of: 

a) exposing said polynucleotide or polypeptide
to a composition of Claim 1.

12. A method of identifying or comparing a target 
sequence with a reference, said method comprising the step of: 

a) exposing said target sequence to a
composition of Claim 1;

b) determining the pattern of positions of
said reagents which specifically interact
with said target sequence; and

c) comparing said pattern with the pattern 
exhibited by said reference when exposed to 
said composition. 
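
The pattern comparison recited in claim 12 can be sketched in idealized form. The sketch assumes perfect-match recognition with no mismatches, labels each probe by the subsequence it recognizes (rather than by its complement), and uses a probe's list index as its position on the substrate. The function names and example sequences are hypothetical.

```python
def hybridization_pattern(target, probes):
    """Return the set of probe positions whose recognized subsequence occurs in target."""
    return {pos for pos, probe in enumerate(probes) if probe in target}

def compare_patterns(target, reference, probes):
    """Split probe positions into shared, target-only and reference-only sets."""
    t = hybridization_pattern(target, probes)
    r = hybridization_pattern(reference, probes)
    return {"shared": t & r, "target_only": t - r, "reference_only": r - t}

# Hypothetical probe layout: list index = position on the substrate.
probes = ["ACG", "CGT", "GTA", "TAC", "GGG"]
result = compare_patterns("ACGTA", "ACGTT", probes)
print(result)
```

Positions that light up for only one of the two sequences indicate where the target differs from the reference. A real measurement would also have to account for mismatch hybridization and signal thresholds, which this sketch ignores.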

13. A method for sequencing a segment of a
polynucleotide comprising the steps of: 

a) combining: 

i) a substrate comprising a plurality of 
chemically synthesized and 
positionally distinguishable 
oligonucleotides capable of 
recognizing defined oligonucleotide 
sequences; and 
ii) a target polynucleotide; thereby
forming high fidelity matched duplex
structures of complementary
subsequences of known sequence; and
b) determining which of said reagents have
specifically interacted with subsequences
in said target polynucleotide.

14. A method of Claim 13, wherein said segment is 
substantially the entire length of said polynucleotide. 

15. A method for sequencing a polymer, said method
comprising the steps of:

a) preparing a plurality of reagents which
each specifically bind to a subsequence of
preselected length;
b) positionally attaching each of said
reagents to one or more solid phase
substrates, thereby producing substrates of
positionally definable sequence specific
probes;
c) combining said substrates with a target
polymer whose sequence is to be determined;
and
d) determining which of said reagents have
specifically interacted with subsequences
in said target polymer.

16. A method of Claim 15, wherein said substrates
are beads.

17. A method of Claim 15, wherein said plurality of 
reagents comprise substantially all possible subsequences of 
said preselected length found in said target. 

18. A method of Claim 15, wherein said solid phase 
substrates are a single substrate having attached thereto 
reagents recognizing substantially all possible subsequences of
preselected length found in said target.

19. A method of Claim 15, further comprising the 
step of analyzing a plurality of said recognized subsequences 
to assemble a sequence of said target polymer. 



20. A method of Claim 16, wherein at least some of
said plurality of substrates have one subsequence specific
reagent attached thereto, and said substrates are coded to
indicate the specificity of said reagent.



21. A method of using a fluorescent nucleotide to 
detect interactions with oligonucleotide probes of known 
sequence, said method comprising: 

a) attaching said nucleotide to a target
of unknown polynucleotide sequence, and

b) exposing said target polynucleotide 
sequence to a collection of positionally 
defined oligonucleotide probes of known 
sequences to determine the sequences of 
said probes which interact with said 
target. 



22. A method of Claim 21, further comprising the
step of:

a) collating said known sequences to determine
the overlaps of said known sequences to
determine the sequence of said target
sequence.
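
The collation of overlapping known sequences can be sketched as follows. This is a simplified illustration, not the procedure of the disclosure: it assumes that every length-k subsequence of the target was detected exactly once, that all probes have the same length k >= 2, and that each (k-1)-base overlap is unique, so the chain of overlaps is unambiguous. The function name `assemble` is hypothetical.

```python
def assemble(kmers):
    """Rebuild a target from its set of overlapping k-mers (k >= 2).

    Assumes an unambiguous, non-circular chain of (k-1)-base overlaps;
    raises ValueError when that assumption does not hold.
    """
    kmers = set(kmers)
    k = len(next(iter(kmers)))
    suffixes = {s[1:] for s in kmers}
    by_prefix = {}
    for s in kmers:
        by_prefix.setdefault(s[:-1], []).append(s)
    # The first k-mer is the one whose prefix is not the suffix of any other.
    starts = [s for s in kmers if s[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting subsequence")
    seq = starts[0]
    used = {seq}
    while True:
        candidates = [s for s in by_prefix.get(seq[-(k - 1):], []) if s not in used]
        if not candidates:
            break
        if len(candidates) > 1:
            raise ValueError("ambiguous overlap")
        seq += candidates[0][-1]  # extend by the one new base
        used.add(candidates[0])
    if used != kmers:
        raise ValueError("some subsequences could not be placed")
    return seq

detected = ["GCA", "ATG", "AGT", "TGC", "CAG"]  # hypothetical probe hits, any order
print(assemble(detected))
```

Repeated subsequences in the target break the uniqueness assumption. In that case the overlaps form a branching graph rather than a single chain, and additional information is needed to resolve the order.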

23. A method of mapping a plurality of sequences 
relative to one another, said method comprising: 

a) preparing a substrate having a plurality of
positionally attached sequence specific
probes;

b) exposing each of said sequences to said 
substrate, thereby determining the patterns 
of interaction between said sequence 
specific probes and said sequences; and

c) determining the relative locations of said sequence specific
probe interactions on said sequences to determine the overlaps
and order of said sequences.



24. A method of claim 23, wherein said sequence specific probes are
oligonucleotides.

25. A method of claim 23, wherein said sequences are nucleic acid
sequences.



26. A method of preparing sequences on a substrate comprising the steps
of:

a) exposing a first region of said substrate to an activator to remove a protective
group;

b) exposing at least said first region to a first monomer; 

c) exposing a second region to an activator to remove a protective group; and 

d) exposing at least said second region to a second monomer. 



27. The method as recited in claim 26 wherein said steps of exposing to an
activator use an activator selected from the group consisting of ion beams, electron beams,
gamma rays, x-rays, ultra-violet radiation, light, infra-red radiation, microwaves, electric
currents, radiowaves, and combinations thereof.

28. The method as recited in claim 26 wherein said protective groups are
photosensitive protective groups.

29. The method as recited in claim 26 wherein said steps of exposing to an 
activator are steps of applying light to selected regions of said substrate. 

30. The method as recited in claim 26 wherein said first and the second 
monomers are amino acids. 

31. The method as recited in claim 26 further comprising a step of
screening sequences on said substrate for affinity with a receptor, said step of screening
further comprising the step of exposing said substrate to said receptor and testing for the
presence of said receptor in said first and said second region.

32. The method as recited in claim 31 wherein said receptor is an antibody.

33. The method as recited in claim 26 wherein said substrate is selected 
from the group consisting of polymerized Langmuir Blodgett film, functionalized glass,
germanium, silicon, polymers, (poly)tetrafluoroethylene, polystyrene, gallium arsenide, and
combinations thereof.

34. The method as recited in claim 26 wherein said protective group is
selected from the group consisting of ortho-nitrobenzyl derivatives, 6-nitroveratryloxycarbonyl,
2-nitrobenzyloxycarbonyl, cinnamoyl derivatives, and mixtures thereof.

35. The method as recited in claim 26 wherein said first and second 
regions each have total areas of less than 1 cm².

36. The method as recited in claim 26 wherein said first and second
regions each have total areas of between about 1 µm² and 10,000 µm².
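
As a rough worked example of what the recited region areas imply for packing density, the sketch below converts a region area in square micrometers into an upper bound on the number of regions per square centimeter. It assumes the regions tile the surface with no gaps or borders, which a real substrate would not achieve; the function name is hypothetical.

```python
# 1 cm = 10**4 micrometers, so 1 cm^2 = 10**8 square micrometers.
UM2_PER_CM2 = 10**8

def max_regions_per_cm2(region_area_um2):
    """Upper bound on regions per cm^2, assuming gap-free tiling."""
    return UM2_PER_CM2 / region_area_um2

for area in (10_000, 100, 1):
    print(area, max_regions_per_cm2(area))
```

Under that assumption, regions of 10,000 µm² correspond to at most 10^4 regions per cm², and regions of 1 µm² to at most 10^8.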

37. The method as recited in claim 29 wherein said light is monochromatic
coherent light.

38. The method as recited in claim 26 wherein said steps of exposing to an 
activator are carried out with a solution in contact with said substrate. 

39. The method as recited in claim 38 wherein said solution further 
comprises said first or said second monomer.

40. The method as recited in claim 31 wherein said receptor further 
comprises a marker selected from the group consisting of radioactive markers and fluorescent 
markers and wherein said step of testing for the presence of the receptor is a step of detecting
said marker. 

41. The method as recited in claim 26 wherein the steps of exposing to an
activator further comprise steps of:

a) placing a mask adjacent to said substrate, said mask having substantially
transparent regions and substantially opaque regions at a wavelength of light; and

b) illuminating said mask with a light source, said light source producing at
least said wavelength of light.

42. The method as recited in claim 26 wherein said steps are repeated so as 
to synthesize 10³ or more different sequences on said substrate.

43. The method as recited in claim 26 wherein said steps are repeated so as
to synthesize 10⁶ or more different sequences on said substrate.

44. A method of synthesizing a plurality of chemical sequences, said 
chemical sequences comprising at least a first and a second monomer, comprising the steps 
of: 

a) at a first region on a substrate having at least a first and a second region,
said first and said second region comprising a substrate protective group, activating said first
region to remove said substrate protective group in said first region;

b) exposing said first monomer to said substrate, said first monomer further
comprising a first monomer protective group, said first monomer binding at said first region;

c) activating said second region to remove said substrate protective group in
said second region;

d) exposing said second monomer to said substrate, said second monomer
further comprising a second monomer protective group, said second monomer binding at said
second region;

e) activating said first region to remove said first monomer protective group;

f) exposing a third monomer to said substrate, said third monomer binding at
said first region to produce a first sequence;

g) activating said second region to remove said second monomer protective
group; and

h) exposing a fourth monomer to said substrate, said fourth monomer binding
at said second region to produce a second sequence, said second sequence different from said
first sequence.
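
Steps a) through h) of claim 44 can be followed with a small bookkeeping model. The sketch assumes idealized chemistry: activation removes the protective group only in the named region, and a monomer couples wherever the surface is unprotected and nowhere else. The class `Substrate`, its methods, and the single-letter monomers are hypothetical names chosen for illustration.

```python
class Substrate:
    """Tracks, per region, the growing chain and whether its end is protected."""

    def __init__(self, regions):
        self.chains = {r: [] for r in regions}
        self.protected = {r: True for r in regions}  # substrate protective group

    def activate(self, region):
        """Remove the protective group in one region (e.g. by masked illumination)."""
        self.protected[region] = False

    def expose(self, monomer, carries_protective_group=True):
        """Expose the whole substrate to a monomer; it binds only where unprotected."""
        for region in self.chains:
            if not self.protected[region]:
                self.chains[region].append(monomer)
                # A protected monomer re-caps the region after coupling.
                self.protected[region] = carries_protective_group

s = Substrate(["first", "second"])
s.activate("first");  s.expose("A")   # steps a) and b)
s.activate("second"); s.expose("B")   # steps c) and d)
s.activate("first");  s.expose("C")   # steps e) and f)
s.activate("second"); s.expose("D")   # steps g) and h)
print(s.chains)
```

Because each monomer carries its own protective group in this model, a region that has just coupled is capped again, so the next monomer reaches only the region that was activated immediately before it. The two regions therefore end with different sequences even though every monomer is applied to the whole substrate.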

45. A method of synthesizing a plurality of chemical sequences, said
chemical sequences comprising at least a first and a second monomer, comprising the steps
of:

a) on a substrate having at least a first and a second region, deactivating said
first region to provide a first protective group in said first region;

b) exposing said first monomer to said substrate, said first monomer binding at
said second region;

c) removing said protective group in said first region;

d) deactivating said second region to provide a second protective group in said
second region;

e) exposing said second monomer to said substrate, said second monomer
binding at said first region;

f) removing said protective group in said second region;

g) deactivating said first region to provide a protective group in said first
region;

h) exposing a third monomer to said substrate, said third monomer binding at
said second region to produce a first sequence;

i) removing said protective group in said first region; and

j) exposing a fourth monomer to said substrate, said fourth monomer binding
at said first region to produce a second sequence, said second sequence different than said
first sequence.

46. A method of synthesizing at least a first polymer sequence and a
second polymer sequence on a substrate, said first polymer sequence having a different
monomer sequence from said second polymer sequence, comprising the steps of:

a) inserting a first mask between said substrate and an energy source, said
mask having first regions and second regions, said first regions permitting passage of energy
from said source, said second regions blocking energy from said source;

b) directing energy from said source at said substrate, said energy removing a
protective group from first portions of said first polymer under said first regions of said first
mask;

c) exposing a second portion of said first polymer to said substrate to create a
first polymer sequence;

d) inserting a second mask between said substrate and said energy source,
said second mask having first regions and second regions;

e) directing energy from said source at said substrate, said energy removing
said protective group under said first regions of said second mask from first portions of said
second polymer; and

f) exposing a second portion of said second polymer to said substrate, said
second portion of said second polymer binding with said first portion of said second polymer
to create a second polymer sequence.

47. The method as recited in claim 46 wherein said energy is selected from
the group consisting of ion beams, electron beams, gamma rays, x-rays, ultra-violet radiation,
light, infra-red radiation, microwaves, electric fields, radiowaves, and combinations thereof.

48. The method as recited in claim 44 wherein said protective groups are
photosensitive protective groups.

49. The method as recited in claims 44 or 45 wherein said steps of
activating and deactivating are steps of applying light to selected regions of said substrate.

50. The method as recited in claims 44 or 45 wherein said first and said
second monomers are amino acids.

51. The method as recited in claims 44, 45 or 46 further comprising a step
of screening said first and said second sequences for affinity with a first receptor, said step of
screening further comprising a step of exposing said substrate to said first receptor and testing
for the presence of said first receptor.

52. The method as recited in claim 51 wherein said step of screening is a
step of screening with antibodies.

53. The method as recited in claims 44, 45 or 46 wherein said substrate is
selected from the group consisting of a polymerized Langmuir Blodgett film, functionalized
glass, germanium, silicon, polymers, (poly)tetrafluoroethylene, gallium arsenide, gallium
phosphide, silicon oxide, silicon nitride and combinations thereof.

54. The method as recited in claim 44 wherein said protective group, said
first monomer protective group, and said second monomer protective group are selected from
the group consisting of ortho-nitrobenzyl derivatives, 6-nitroveratryloxycarbonyl,
2-nitrobenzyloxycarbonyl, and mixtures thereof.

55. The method as recited in claim 45 wherein said protective group is a
cinnamate group.

56. The method as recited in claims 44 or 45 wherein said first and second
regions each have total areas of less than 1 cm².

57. The method as recited in claims 44 or 45 wherein said first and second
regions each have total areas of between about 1 µm² and 10,000 µm².

58. The method as recited in claim 49 wherein said light is monochromatic
coherent light.

59. The method as recited in claim 44 wherein said steps of activating are
carried out with a solution in contact with said substrate.

60. The method as recited in claim 59 wherein said solution further
comprises a monomer.

61. The method as recited in claim 51 wherein said receptor further
comprises a marker selected from the group consisting of radioactive markers and fluorescent
markers and wherein said step of testing for the presence of the receptor is a step of detecting
said marker.



62. The method as recited in claims 44 or 45 wherein two of said first, said 
second, said third, and said fourth monomers are the same monomers. 



63. The method as recited in claim 46 wherein the step of inserting a 
second mask is a step of translating said first mask from a first position to a second position. 

64. The method as recited in claim 66 wherein the step of inserting a 
second mask is a step of rotating said first mask.

65. The method as recited in claim 51 further comprising the step of
exposing said substrate to a second, labeled receptor, said second, labeled receptor binding at
multiple sites on said first receptor.

66. The method as recited in claim 65 wherein said first receptor is an
antibody of a first animal species and said second receptor is an antibody derived from a
second species and directed at said first species.



67. The method as recited in claim 44 wherein: 

a) said first monomer protective group is removable upon exposure to a first
wavelength of light;

b) said second monomer protective group is removable upon exposure to a
second wavelength of light;

c) said step of activating said first region to remove said first monomer
protective group is a step of exposing substantially all of said substrate to said first
wavelength of light; and 

d) said step of activating said second region to remove said second monomer 
protective group is a step of exposing substantially all of said substrate to said second 
wavelength of light. 


68. A method as recited in claims 44 or 46 wherein said protective groups
are of the form:

where R1 is alkoxy, alkyl, halo, aryl, alkenyl, or hydrogen; R2 is alkoxy, alkyl, halo, aryl,
nitro, or hydrogen; R3 is alkoxy, alkyl, halo, nitro, aryl, or hydrogen; R4 is alkoxy, alkyl,
hydrogen, aryl, halo, or nitro; and R5 is alkyl, alkynyl, cyano, alkoxy, hydrogen, halo, aryl, or
alkenyl.

69. A method of screening a plurality of amino acid sequences for binding
with a receptor comprising the steps of:

a) on a glass plate having at least a first surface, said at least a first surface
comprising a photoprotective material selected from the group consisting of nitroveratryloxy
carbonyl and nitrobenzyloxy carbonyl, reacting said at least a first surface with
t-butoxycarbonyl for storage, said glass plate substantially transparent to at least ultraviolet
light;

b) exposing said at least a first surface to TFA to remove said
t-butoxycarbonyl;

c) placing said glass plate on a reactor, said reactor comprising a reactor space,
said at least a first surface exposed to said reactor space;

d) placing a mask at a first position on said glass plate, said mask comprising
first locations and second locations, said first locations substantially transparent to at least
ultraviolet light and said second locations substantially opaque to at least ultraviolet light,
said second locations comprising a light blocking material on a first surface of said mask,
said first surface of said mask placed in contact with said glass plate;

e) filling said reactor space with a reaction solution;

f) illuminating said mask with at least ultraviolet light, said ultraviolet light 
removing said photoprotective material from said at least a first surface of said glass plate 
under said first locations of said mask; 

g) exposing said first surface to a first amino acid, said first amino acid 
binding to regions of said at least a first surface from which said photoprotective material was 
removed, said first amino acid comprising said photoprotective group at a terminus thereof; 

h) placing a mask in contact with said glass plate at a second position; 

i) illuminating said mask with at least ultraviolet light, said ultraviolet light 
removing said photoprotective material from said at least a first surface of said glass plate 
under said first locations of said mask;

j) exposing said at least a first surface to a second amino acid, said second 
amino acid binding to regions of said at least a first surface from which said photoprotective 
material was removed, said second amino acid comprising said photoprotective group at a
terminus thereof;

k) placing a mask in contact with said glass plate at a third position;

l) illuminating said mask with at least ultraviolet light, said ultraviolet light
removing said photoprotective material from said at least a first surface of said glass plate 
under said first locations of said mask; 

m) exposing said at least a first surface to a third amino acid, said third amino 
acid binding to regions of said at least a first surface from which said photoprotective 
material was removed; 

n) placing a mask in contact with said glass plate at a fourth position; 

o) illuminating said mask with at least ultraviolet light, said ultraviolet light 
removing said photoprotective material from said at least a first surface of said glass plate 
under said first locations of said mask; 

p) exposing said at least a first surface to a fourth amino acid, said fourth 
amino acid binding to regions of said at least a first surface from which said photoprotective 
material was removed, said at least a first surface comprising at least first, second, third, and 
fourth amino acid sequences; 

q) exposing said at least a first surface to an antibody of interest, said antibody 
of interest binding more strongly to at least one of said first, said second, said third, or said 
fourth amino acid sequences; 

r) exposing said at least a first surface to a receptor, said receptor recognizing
said antibody of interest and binding at multiple locations thereof, said receptor comprising 
fluorescein; 

s) exposing said at least a first surface to light, said first surface fluorescing in 
at least a region where said more strongly bound amino acid sequence is located; and

t) detecting and recording fluoresced light intensity as a function of location 
across said at least a first surface. 

70. A method of identifying at least one peptide sequence for binding with
a receptor comprising the steps of:

a) on a substrate having a plurality of polypeptides, each having a
photoremovable protective group, irradiating first selected polypeptides to remove said
protective group;

b) contacting said polypeptides with a first amino acid to create a first
sequence, second polypeptides on said substrate comprising a second sequence; and

c) identifying which of said first or said second sequence binds with said
receptor.



71. The method as recited in claim 70 wherein said step of identifying
further comprises a step of detecting the presence of a marker selected from the group
consisting of radioactive markers and fluorescent markers in said receptor.

72. The method as recited in claim 70 wherein said step of irradiating is a
step of masking a light source with a mask, said mask comprising first transparent regions
and second opaque regions.



73. The method as recited in claim 72 wherein the step of identifying 
further comprises the steps of: 

a) exposing a first receptor to said substrate; and
b) exposing a receptor to said first receptor to said substrate, said receptor to
said first receptor comprising a marker.

74. The method as recited in claim 73 wherein said marker is selected from 
the group consisting of radioactive markers and fluorescent markers. 

75. The method as recited in claim 73 wherein said first receptor is an
antibody from a first species and said receptor to said first receptor is an antibody from a
second species directed at said first species.

76. A method for screening a plurality of polymers for biological activity
comprising exposing a receptor to a substrate having said plurality of said polymers on a
surface thereof, each of said polymers occupying an area of less than about 1 cm².

77. A method for screening as recited in claim 73 wherein said area is less
than about 0.1 cm².

78. A method as recited in claim 73 wherein said area is less than about
10,000 µm².

79. A method as recited in claim 73 wherein said area is less than about
100 µm².



80. Apparatus for preparation of a plurality of polymers comprising: 

a) a substrate with a surface, said surface comprising a reactive portion, said
reactive portion activated upon exposure to an energy source so as to react with a monomer;
and

b) means for selectively protecting and exposing portions of said surface from
said energy source.

81. Apparatus as recited in claim 80 wherein said reactive portion further
comprises a protective group, said protective group of the form:

where R1 is alkoxy, alkyl, halo, aryl, alkenyl, or hydrogen; R2 is alkoxy, alkyl, halo, aryl,
nitro, or hydrogen; R3 is alkoxy, alkyl, halo, nitro, aryl, or hydrogen; R4 is alkoxy, alkyl,
hydrogen, aryl, halo, or nitro; and R5 is alkyl, alkynyl, cyano, alkoxy, hydrogen, halo, aryl, or
alkenyl.

82. Apparatus as recited in claim 80 wherein said reactive portion further
comprises linker molecules.

83. Apparatus as recited in claim 82 wherein said linker molecules are
selected from the group consisting of ethylene glycol oligomers, diamines, diacids, amino
acids, and combinations thereof.

84. Apparatus as recited in claim 80 wherein said means for selectively 
protecting further comprises a mask. 

85. Apparatus as recited in claim 80 wherein said means for selectively 
protecting further comprises a light valve. 



86. Apparatus as recited in claim 80 wherein said energy source is a light
source.

87. Apparatus as recited in claim 80 wherein said reactive portion further
comprises a composition selected from the group consisting of nitroveratryloxy carbonyl,
nitrobenzyloxy carbonyl, dimethyl-dimethoxybenzyloxy carbonyl, 5-bromo-7-nitroindolinyl,
hydroxy-2-methyl cinnamoyl, and 2-oxymethylene anthraquinone.

88. Apparatus for preparation of a substrate having a plurality of amino 
acid sequences thereon, said apparatus comprising:

a) a substrate with a surface;

b) a protective group on said surface, said protective group removable upon
exposure to an energy source, said energy source selected from the group consisting of light,
electron beams, and x-ray radiation;

c) means for directing said energy source at selected locations on said surface;
and

d) means for exposing amino acids to said surface for binding to said surface. 

89. Apparatus for screening polymers comprising a substrate with a 
surface, said surface comprising at least two predefined regions, said predefined regions 
containing different monomer sequences thereon, said predefined regions each occupying an 
area of less than about 0.1 cm².

90. Apparatus as recited in claim 89 wherein said area is less than about
0.01 cm².

91. Apparatus as recited in claim 89 wherein said area is less than 10000
µm².

92. Apparatus as recited in claim 89 wherein said area is less than about
100 µm².

93. Apparatus as recited in claims 89, 90, 91, or 92 wherein said monomer 
sequences are substantially pure within said predefined regions. 

94. A substrate for screening for biological activity, said substrate 
comprising 10³ or more different ligands on a surface thereof in predefined regions.

95. A substrate as recited in claim 94 wherein said substrate comprises 10⁴
or more different ligands in predefined regions.

96. A substrate as recited in claim 94 wherein said substrate comprises 10⁵
or more different ligands in predefined regions.

97. A substrate as recited in claim 94 wherein said substrate
comprises 10⁶ or more different ligands in predefined regions.

98. A substrate as recited in claims 94, 95, 96, or 97 wherein the ligands
are peptides.

99. A substrate as recited in claim 89 wherein said ligands are substantially 
pure within said predefined regions. 

100. Apparatus for screening for biological activity comprising:

a) a substrate comprising a plurality of polymer sequences, said polymer
sequences attached to a surface of said substrate at known locations on said substrate, each of
said sequences occupying an area of less than about 0.1 cm²;

b) means for exposing said substrate to a receptor, said receptor marked with a 
fluorescent marker, said receptor binding with at least one of said sequences; and 

c) means for detecting a location of said fluorescent marker on said substrate. 

101. Apparatus for forming a plurality of polymer sequences comprising: 

a) a substrate, said substrate having at least a first surface and a second 
surface, said second surface comprising a photoremovable protective material, said substrate 
substantially transparent to at least light of a first wavelength; 

b) a reactor body, said reactor body having a mounting surface with a reaction 
fluid cavity therein, said second surface maintained in a sealed relationship with said 
mounting surface; and 

c) a light source for producing light of at least said first wavelength and 
directed at a surface of said substrate. 

102. Apparatus as recited in claim 101 wherein said light source is directed 
at said first surface. 

103. Apparatus as recited in claim 101 further comprising a mask, said 
mask placed between said light source and said first surface, said mask having first regions
substantially transparent to said first wavelength of light and second regions substantially 
opaque to said first wavelength of light. 

104. Apparatus as recited in claim 101 wherein said cavity comprises a fluid
inlet and a fluid outlet, said fluid inlet connected to a pump for flowing reaction fluids 
through said cavity. 

105. Apparatus as recited in claim 101 wherein said cavity further 
comprises a plurality of raised sections. 

106. Apparatus as recited in claim 103 wherein said mask further comprises
a glass plate.

107. Apparatus as recited in claim 106 wherein said opaque regions on said 
mask comprise chrome. 

108. Apparatus as recited in claim 101 wherein at least a portion of said 
second surface comprises a second photoremovable protective group, said second 
photoremovable protective group activatable upon exposure to light of a second wavelength. 

109. Apparatus as recited in claim 101 further comprising first and second 
gaskets on said mounting surface and means for maintaining a vacuum between said first and 
second gaskets. 

110. Apparatus as recited in claim 101 wherein said substrate has a 
thickness of less than 1 mm.

111. Apparatus as recited in claim 101 wherein said substrate has a 
thickness of less than 0.5 mm. 

112. Apparatus as recited in claim 101 wherein said substrate has a
thickness of less than 0.05 mm. 

113. Apparatus as recited in claim 103 wherein said mask is in direct 
contact with said substrate. 

114. Apparatus as recited in claim 113 wherein opaque regions of said mask
are placed in direct contact with said substrate. 



115. Apparatus as recited in claim 101 further comprising a liquid crystal 
light valve for selectively controlling exposure of light to said substrate. 

116. Apparatus as recited in claim 101 further comprising a fiber optic 
faceplate between said light source and said substrate. 

117. Apparatus as recited in claim 101 further comprising a molecular 
microcrystal between said light source and said substrate.

118. Apparatus as recited in claim 101 wherein said cavity comprises light 
absorptive materials. 

119. Apparatus as recited in claim 118 wherein said light absorptive
material is N,N-diethylamino 2,4-dinitrobenzene.

120. Apparatus as recited in claim 101 wherein said cavity is filled with a 
carrier solution. 

121. Apparatus as recited in claim 120 wherein said carrier material 
comprises a material selected from the group of 1-hydroxybenzotriazole, dimethylformamide, 
diisopropylethylamine, and
benzotriazolyl-n-oxy-tris(dimethylamino)phosphonium hexafluorophosphate.

122. Apparatus as recited in claim 101 wherein said substrate is a fiber optic
faceplate.

123. Apparatus for detection of fluorescent marked regions on a substrate
comprising:

a) a light source for directing light at a surface of said substrate; 

b) a means for detecting light fluoresced from said surface in response to said
light source;

c) means for translating said substrate from a first position to a second
position; and

d) means for storing fluoresced light intensity as a function of location on said 
substrate, said means for storing connected to said means for translating and said means for 
detecting. 

124. Apparatus as recited in claim 123 further comprising video display 
means for displaying light intensity as a function of location on said substrate. 

125. Apparatus as recited in claim 123 wherein said means for detecting 
comprises a photomultiplier tube and a photon counter.

126. Apparatus as recited in claim 124 wherein said means for directing 
light further comprises a dichroic mirror, said mirror reflecting light at a wavelength of said 
light source and passing said fluoresced light. 

127. Apparatus as recited in claim 125 wherein said light source is a laser
light source.

128. Apparatus as recited in claim 126 wherein said means for storing is a 
programmed digital computer. 

129. Apparatus as recited in claim 127 further comprising a microscope,
said light source directed at said substrate through said microscope, said means for detecting
receiving light from said microscope.

130. A method of identifying at least one polymer for binding with a 
receptor comprising the steps of:

a) on a substrate, said substrate comprising polymers immobilized on a surface 
of said substrate, said polymers comprising a photoremovable protective group, irradiating a 
first region of said substrate without irradiating a second region of said substrate to remove 
said protecting group from said polymers in said first region; and 

b) contacting said substrate with a first monomer to couple said monomer to
said polymer in said first region, forming a first polymer on said substrate in said first region 
that is different from said polymer in said second region. 

131. The method as recited in claim 130 wherein said step of irradiating is a 
step of masking a light source with a mask, said mask comprising first transparent regions 
and second opaque regions, said transparent regions transmitting light from said source to 
said first regions, and said opaque regions blocking light from said source to said second 
regions.

132. The method as recited in claim 130 wherein said first and second
regions each have total areas less than about 1 cm².

133. The method as recited in claim 130 wherein said steps of irradiating 
are conducted with a monochromatic light. 

134. The method as recited in claim 130 wherein said step of irradiating a 
first region is a step of masking a light source with a mask located in a first position, and 
wherein said step of irradiating a second region is a step of masking a light source with said 
mask located in a second position. 

135. The method as recited in claim 130 wherein the step of irradiating 
further comprises the steps of: 

a) placing a mask adjacent to said substrate, said mask having 
substantially transparent regions and substantially opaque regions at a wavelength of light; 
and 

b) illuminating said mask with a light source, said light source 
producing at least said wavelength of light. 

136. An array of oligonucleotides, the array comprising:
a planar solid support having at least a first surface; and 

at least 1000 different oligonucleotides attached to the first surface of the solid 
support in an area of less than 1 cm², wherein each of the different oligonucleotides is
attached to the surface of the solid support in a different known location, and has a different 
sequence. 

137. The array of claim 136, wherein each different oligonucleotide is
from about 4 to about 20 nucleotides in length.

138. The array of claim 136, wherein each different oligonucleotide is at
least 12 nucleotides in length.

139. The array of claim 136, wherein each different oligonucleotide is
2-100 nucleotides in length.

140. The array of claim 136, wherein the array comprises at least 1,000
different oligonucleotides attached to the first surface of the solid support. 

141. The array of claim 136, wherein the array comprises at least 10,000 
different oligonucleotides attached to the first surface of the solid support. 

142. The array of claim 136, wherein each of the different known locations
is physically separated from each other of the known locations. 

143. The array of claim 136, wherein said planar solid support is glass. 

144. The array of claim 136, wherein said oligonucleotides are attached to 
the first surface of the solid support through a linker group. 

145. The array of claim 136, wherein the oligonucleotides in the different
known locations are at least 20% pure.

146. The array of claim 136, wherein the oligonucleotides in the different
known locations are at least 50% pure.

147. The array of claim 136, wherein the oligonucleotides in the different
known locations are at least 80% pure.

148. The array of claim 136, wherein the oligonucleotides in the different
known locations are at least 90% pure.

149. The array of claim 136, wherein the oligonucleotides in the different 
known locations are of known sequences. 

150. The array of claim 136, wherein said array is produced by a binary 
synthesis process, said process comprising the steps of: 

providing a planar solid support, said solid support having a plurality of compounds 
immobilized on a surface thereof, said compounds having protecting groups coupled thereto; 

deprotecting a first portion of said plurality of compounds on said surface and not a 
second portion of said plurality of compounds; 

reacting said first portion of said plurality of compounds with a first component of 
said oligonucleotide; 

deprotecting at least a third portion of said plurality of compounds on said surface, 
said third portion comprising a fraction of said first portion of said plurality of compounds; 
reacting said at least third portion of said plurality of compounds with a second component of 
said oligonucleotide; and 

optionally repeating said binary synthesis steps to produce said oligonucleotide array. 

151. An array of nucleic acids, the array comprising: 
a planar support having at least a first surface; and 

at least 1000 different nucleic acids attached to the first surface of the solid support 
within an area of 1 cm², wherein each of the different nucleic acids is attached to the surface
of the solid support in a different known location, and has a different determinable sequence.

152. The array of claim 151, wherein each different nucleic acid is at least 
20 nucleotides in length. 

153. The array of claim 151, wherein the array comprises at least 1,000
different nucleic acids attached to the first surface of the solid support. 

154. The array of claim 151, wherein the array comprises at least 10,000
different nucleic acids attached to the first surface of the solid support.



155. The array of claim 151, wherein each of the different known locations
is physically separated from each of the other known locations.

156. The array of claim 151, wherein said planar solid support is glass.

157. The array of claim 151, wherein said nucleic acids are attached to the
first surface of the solid support through a linker group.

158. The array of claim 151, wherein the nucleic acids in the different 
known locations comprise nucleic acids that are at least 20% pure. 



159. The array of claim 151, wherein the nucleic acids in the different known
locations comprise nucleic acids that are at least 50% pure.

160. The array of claim 151, wherein the nucleic acids in the different
known locations are at least 80% pure. 

161. The array of claim 151, wherein the nucleic acids in the different known
locations are at least 90% pure.



162. The array of claim 151, wherein said array is produced by a binary
synthesis process, said process comprising the steps of:

providing a planar, solid support, said solid support having a plurality of compounds
immobilized on a surface thereof, said compounds having protecting groups coupled thereto;

deprotecting a first portion of said plurality of compounds on said surface and not a second
portion of said plurality of compounds;

reacting said first portion of said plurality of compounds with a first reactant;

deprotecting at least a third portion of said plurality of compounds on said surface, said third
portion comprising a fraction of said first portion of said plurality of compounds;

reacting said at least third portion of said plurality of compounds with a second reactant; and

optionally repeating said binary synthesis steps to produce said nucleic acid array.



163. The array of claim 151, wherein the nucleic acids are covalently
attached to the support.

164. An array of nucleic acids, the array comprising:
a planar support having at least a first surface; and 

a plurality of different nucleic acids attached to the first surface of the solid support at 
a density exceeding 10,000 different nucleic acids/cm², wherein each of the different nucleic
acids is attached to the surface of the solid support in a different known location, and has a 
different determinable sequence. 

165. An array of nucleic acids, the array comprising: 
a planar support having at least a first surface; and 

a plurality of different nucleic acids attached to the first surface of the solid support at 
a density exceeding 400 different nucleic acids/cm², wherein each of the different nucleic
acids is attached to the surface of the solid support in a different known location, has a 
different determinable sequence, wherein the surface and the support are made from different 
materials. 

166. The array of claim 151, wherein the different known locations are
square in shape.

167. The array of claim 151, wherein the substrate is glass.

168. The array of claim 151, wherein the substrate is silicon dioxide.

169. The array of claim 151, wherein the substrate is
(poly)tetrafluoroethylene, (poly)vinylidenedifluoride, polystyrene or polycarbonate.

170. The method of claim 151, wherein the substrate is optically transparent.

171. The array of claim 151, wherein the substrate is functionalized with 
that attach to the plurality of different nucleic acids. 

SUPPORT BOUND PROBES AND METHODS OF ANALYSIS USING THE SAME



ABSTRACT 



The present invention provides methods and apparatus for sequencing, 
fingerprinting and mapping biological macromolecules, typically biological polymers. The 
methods make use of a plurality of sequence specific recognition reagents which can also be 
used for classification of biological samples, and to characterize their sources. 
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CA 


94305 




Full Name of Inventor 3:
Last Name: FODOR
First Name: STEPHEN
Middle Name or Initial: P.A.

Residence & Citizenship:
City: Palo Alto
State/Foreign Country: California
Country of Citizenship: USA

Post Office Address:
Post Office Address: 1120 Parkinson
City: Palo Alto
State/Country: CA
Postal Code: 94301




Full Name of Inventor 4:
Last Name: READ
First Name: J.
Middle Name or Initial: Leighton

Residence & Citizenship:
City: Palo Alto
State/Foreign Country: California
Country of Citizenship: USA

Post Office Address:
Post Office Address: 1001 Ramona Avenue
City: Palo Alto
State/Country: CA
Postal Code: 94301

I further declare that all statements made herein of my own knowledge are true and that all statements made on information and belief
are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and that such willful
false statements may jeopardize the validity of the application or any patent issuing thereon.



Signature of Inventor 1:
Michael C. Pirrung
Date:

Signature of Inventor 2:
Lubert Stryer
Date: June [day illegible], 2000

Signature of Inventor 3:
Stephen P.A. Fodor
Date:

Signature of Inventor 4:
J. Leighton Read
Date:

DECLARATION AND POWER OF ATTORNEY

As a below named inventor, I declare that: 

My residence, post office address and citizenship are as stated below next to my name; I believe I am the original, first and sole 
inventor (if only one name is listed below) or an original, first and joint inventor (if plural inventors are named below) of the subject 
matter which is claimed and for which a patent is sought on the invention entitled: SUPPORT BOUND PROBES AND METHODS 
OF ANALYSIS USING THE SAME the specification of which was filed on April 24, 2000 as Application No. 09/557,875. 

I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any 
amendment referred to above. I acknowledge the duty to disclose information which is material to the examination of this application 
in accordance with Title 37, Code of Federal Regulations, Section 1.56. I claim foreign priority benefits under Title 35, United States 
Code, Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any
foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed. 

Prior Foreign Application(s)

Country    Application No.    Date of Filing    Priority Claimed Under 35 USC 119










I hereby claim the benefit under Title 35, United States Code § 119(e) of any United States provisional application(s) listed below:

Application No.    Filing Date







I claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the
subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by
the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in
Title 37, Code of Federal Regulations, Section 1.56 which occurred between the filing date of the prior application and the national or
PCT international filing date of this application:



Application No.    Date of Filing       Status
09/056,927         April 8, 1998
08/670,118         June 25, 1996
08/168,904         December 15, 1993
07/624,114         December 6, 1990
07/362,901         June 7, 1989
08/348,471         November 30, 1994
07/805,727         December 6, 1991
07/492,462         March 7, 1990

POWER OF ATTORNEY: As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this
application and transact all business in the Patent and Trademark Office connected therewith. 

Joe Liebeschuetz, Reg. No. 37,505 
William M. Smith, Reg. No. 30,223 
Rosemary Celli, Reg. No. 42,397 
Ted Apple, Reg. No. 36,429 
Philip L. McGarrigle, Reg. No. 31,395 
Wei Zhou, Reg. No. 44,419 
Ellen Gonzales, Reg. No. 44,128 
Vern Norviel, Reg. No. 32,483 



Send Correspondence to:
Joe Liebeschuetz
TOWNSEND and TOWNSEND and CREW LLP
Two Embarcadero Center, 8th Floor
San Francisco, California 94111-3834

Direct Telephone Calls to (Name, Reg. No., Telephone No.):
Name: Joe Liebeschuetz
Reg. No.: 37,505
Telephone: 650-326-2400



Full Name of Inventor 1:
Last Name: PIRRUNG
First Name: MICHAEL
Middle Name or Initial: C.

Residence & Citizenship:
City: Chapel Hill
State/Foreign Country: North Carolina
Country of Citizenship: USA

Post Office Address:
Post Office Address: 102 Whistlingtree Court
City: Chapel Hill
State/Country: NC
Postal Code: 27515


Full Name of Inventor 2:
Last Name: STRYER
First Name: LUBERT
Middle Name or Initial:

Residence & Citizenship:
City: Stanford
State/Foreign Country: California
Country of Citizenship: USA

Post Office Address:
Post Office Address: 843 Sonoma Terrace
City: Stanford
State/Country: CA
Postal Code: 94305


Full Name of Inventor 3:
Last Name: FODOR
First Name: STEPHEN
Middle Name or Initial: P.A.

Residence & Citizenship:
City: Palo Alto
State/Foreign Country: California
Country of Citizenship: USA

Post Office Address:
Post Office Address: 1120 Parkinson
City: Palo Alto
State/Country: CA
Postal Code: 94301


Full Name of Inventor 4:
Last Name: READ
First Name: J.
Middle Name or Initial: Leighton

Residence & Citizenship:
City: Palo Alto
State/Foreign Country: California
Country of Citizenship: USA

Post Office Address:
Post Office Address: 1001 Ramona Avenue
City: Palo Alto
State/Country: CA
Postal Code: 94301

I further declare that all statements made herein of my own knowledge are true and that all statements made on information and belief
are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and that such willful 
false statements may jeopardize the validity of the application or any patent issuing thereon. 



Signature of Inventor 1:
Michael C. Pirrung
Date:

Signature of Inventor 2:
Lubert Stryer
Date:

Signature of Inventor 3:
Stephen P.A. Fodor
Date:

Signature of Inventor 4:
J. Leighton Read

DECLARATION AND POWER OF ATTORNEY

As a below named inventor, I declare that: 

My residence, post office address and citizenship are as stated below next to my name; I believe I am the original, first and sole 
inventor (if only one name is listed below) or an original, first and joint inventor (if plural inventors are named below) of the subject 
matter which is claimed and for which a patent is sought on the invention entitled: SUPPORT BOUND PROBES AND METHODS
OF ANALYSIS USING THE SAME the specification of which _X_ is attached hereto or ___ was filed on ___ as
Application No. ___ and was amended on ___ (if applicable).

I have reviewed and understand the contents of the above identified specification, including the claims, as amended by any 
amendment referred to above. I acknowledge the duty to disclose information which is material to the examination of this application 
in accordance with Title 37, Code of Federal Regulations, Section 1.56. I claim foreign priority benefits under Title 35, United States
Code, Section 119 of any foreign application(s) for patent or inventor's certificate listed below and have also identified below any
foreign application for patent or inventor's certificate having a filing date before that of the application on which priority is claimed. 





Prior Foreign Application(s)

Country    Application No.    Date of Filing    Priority Claimed Under 35 USC 119













I hereby claim the benefit under Title 35, United States Code § 119(e) of any United States provisional application(s) listed below:

Application No.    Filing Date







I claim the benefit under Title 35, United States Code, Section 120 of any United States application(s) listed below and, insofar as the
subject matter of each of the claims of this application is not disclosed in the prior United States application in the manner provided by
the first paragraph of Title 35, United States Code, Section 112, I acknowledge the duty to disclose material information as defined in
Title 37, Code of Federal Regulations, Section 1.56 which occurred between the filing date of the prior application and the national or
PCT international filing date of this application:



Application No.    Date of Filing       Status
09/056,927         April 8, 1998
08/670,118         June 25, 1996
08/168,904         December 15, 1993
07/624,114         December 6, 1990
07/362,901         June 7, 1989
08/348,471         November 30, 1994
07/805,727         December 6, 1991
07/492,462         March 7, 1990

POWER OF ATTORNEY: As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) to prosecute this
application and transact all business in the Patent and Trademark Office connected therewith.

Joe Liebeschuetz, Reg. No. 37,505
William M. Smith, Reg. No. 30,223
Rosemary Celli, Reg. No. 42,397
Ted Apple, Reg. No. 36,429
Philip L. McGarrigle, Reg. No. 31,395
Wei Zhou, Reg. No. 44,419
Ellen Gonzales, Reg. No. 44,128
Vern Norviel, Reg. No. 32,483



Send Correspondence to:
Joe Liebeschuetz
TOWNSEND and TOWNSEND and CREW LLP
Two Embarcadero Center, 8th Floor
San Francisco, California 94111-3834

Direct Telephone Calls to (Name, Reg. No., Telephone No.):
Name: Joe Liebeschuetz
Reg. No.: 37,505
Telephone: 650-326-2400





Full Name of Inventor 1:
Last Name: FODOR
First Name: STEPHEN
Middle Name or Initial: P.A.

Residence & Citizenship:
City:
State/Foreign Country:
Country of Citizenship:

Post Office Address:
Post Office Address:
City:
State/Country:
Postal Code:

Full Name of Inventor 2:
Last Name: STRYER
First Name: LUBERT
Middle Name or Initial:

Residence & Citizenship:
City:
State/Foreign Country:
Country of Citizenship:

Post Office Address:
Post Office Address:
City:
State/Country:
Postal Code:

Full Name of Inventor 3:
Last Name: PIRRUNG
First Name: MICHAEL
Middle Name or Initial: C.

Residence & Citizenship:
City:
State/Foreign Country:
Country of Citizenship:

Post Office Address:
Post Office Address:
City:
State/Country:
Postal Code:

Full Name of Inventor 4:
Last Name: READ
First Name: J.
Middle Name or Initial: Leighton

Residence & Citizenship:
City:
State/Foreign Country:
Country of Citizenship:

Post Office Address:
Post Office Address:
City:
State/Country:
Postal Code:




Attorney Docket No.: 018547-043200US 



I further declare that all statements made herein of my own knowledge are true and that all statements made on information and belief 
are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and that such willful 
false statements may jeopardize the validity of the application or any patent issuing thereon. 



Signature of Inventor 1:
Stephen P.A. Fodor
Date: Aug [illegible]

Signature of Inventor 2:
Lubert Stryer
Date:

Signature of Inventor 3:
Michael C. Pirrung
Date:

Signature of Inventor 4:
J. Leighton Read
Date: